Abstract

An azimuthal angle correlation between the two hardest jets is studied in the $t\bar{t}$ production process
at the 14 TeV LHC. The event samples are generated by merging the tree level matrix elements for
$t\bar{t}$ plus up to 2 or 3 partons with parton showers. The generated event samples show a strong correlation
in the azimuthal angle difference between the two hardest jets, as predicted in the analysis based on the
tree level matrix elements for $t\bar{t}$ + 2 partons. The effects of merging the matrix elements for $t\bar{t}$ + 3
partons on the correlation are studied in detail. It is found that they play an important role in improving
the prediction of the correlation.

1 Introduction


Since the discovery of the Higgs boson was announced in the summer of 2012, the LHC measurements of its 
properties have so far been supporting the standard model (SM) predictions HI HI SI 11] • The Higgs sector 
of the SM respects the charge-conjugation and parity (CP) symmetry and the Higgs boson should be CP 
even. Therefore, if an admixture of a CP odd component is observed, it will be direct evidence of CP
violation in the Higgs sector and thus of physics beyond the SM.

From the analyses on the tree level matrix elements, it has been shown that the azimuthal angle 
difference between the two partons (gluon, quarks or antiquarks) produced in association with the Higgs 
boson produced by gluon fusion is very sensitive to the CP property of the Higgs boson 0 Eld El- Several 
analyses including effects of higher order corrections show that the correlation between the two partons 
found at the tree level matrix elements can be observed as a correlation between the two hardest jets despite 
smearing, see e.g. refs. 0 nm m E2]. However the attempts to observe the CP odd admixture precisely 
in this approach are expected to be difficult due to large theoretical uncertainties in Monte Carlo event 
simulation, particularly in our use of a parton shower generator which can simulate the QCD radiation only 
in the soft and/or collinear limit. 

It has been pointed out in ref. [13] that the two partons produced in association with a top quark pair
have a large azimuthal angle correlation near the threshold $m_{t\bar{t}} \sim 2m_t$, and that the correlation is similar to that
of the two partons produced together with the CP odd Higgs boson via gluon fusion. The claim of ref. [13]
is that experimental techniques to measure such an angular correlation between jets can be established first
by using these SM processes, which have large cross sections. More precisely, one measures the azimuthal
angle difference between two jets produced in association with a top quark pair and tunes a Monte Carlo
event generator to reproduce the data quantitatively. If an event generator tuned in this way is used, the
theoretical uncertainty on the prediction of the azimuthal angle correlation between two jets produced in
association with the Higgs boson can be reduced significantly. This will help achieve accurate measurements
of the CP property of the Higgs boson.

In the present paper we attempt to create a bridge between the proposal of ref. [13] and actual
experimental measurements, by studying our present capability and limitations in simulating the top quark
pair plus multi-jet production process, so that experimentalists can use real data to improve the
simulation tools that will be used to probe more fundamental physics such as the CP property of the Higgs boson.
The simplest method to include the leading higher order corrections to top quark pair production is to
apply a parton shower generator to exclusive top quark pair events, where the parton shower scale
evolution of the top quark pair events produces top quark pair plus multi-jet events at a hadronization
scale. The event samples generated in this way are expected to reproduce qualitatively the multi-jet event
rates and the jet $p_T$ and rapidity distributions, since the successive emission of parton showers follows the
QCD prediction in the soft and/or collinear region and the overall jet rates have been fitted to the data
in $e^+e^-$ and hadronic collisions. Those events, however, do not have correct correlations among jets, since
a parton shower generator emits radiation that is azimuthally symmetric about the parent momentum direction. To
reproduce azimuthal angle correlations between two jets, at least the $t\bar{t}$ + 2 partons matrix elements have to
be embedded. In order to consistently combine event samples of different parton multiplicities generated
by tree level matrix elements with parton showers, a tree level merging algorithm is required. There exist
several tree level merging algorithms proposed in the literature, including the CKKW [14, 15, 16, 17], the
CKKW-L [18, 19, 20], the GKS [21], the MLM [22, 23], the pseudo shower [24] and the shower $k_\perp$ [25]
algorithms. Comparisons between these different algorithms have also been studied [21 HU HU [57]. Our 
objective in the present paper is to implement tree level merging algorithms and study theoretical issues on 
predicting the azimuthal angle correlation between the two hardest jets. 

We generate the event samples for top quark pair production at the 14 TeV LHC by merging the
tree level matrix elements for $t\bar{t}$ plus up to 2 or 3 partons with the parton shower model in PYTHIA8.
Using the generated event samples, we study the azimuthal angle difference between the two hardest jets (i.e. the
two highest transverse momentum jets), $\Delta\phi = \phi_1 - \phi_2$.
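
The observable defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the `(pt, phi)` tuple layout, the function name, the choice of labelling the harder jet as jet 1, and the wrapping of the result into $(-\pi, \pi]$ are all hypothetical conventions, since the text does not specify them at this point.

```python
import math

def delta_phi_two_hardest(jets):
    """Return phi_1 - phi_2 for the two highest-pT jets, wrapped into (-pi, pi].

    `jets` is a list of (pt, phi) tuples. Labelling the harder jet as jet 1
    and the wrapping convention are assumptions made for this illustration.
    """
    if len(jets) < 2:
        raise ValueError("need at least two jets")
    # Sort by transverse momentum, hardest first, and keep the leading two.
    (_, phi1), (_, phi2) = sorted(jets, key=lambda j: j[0], reverse=True)[:2]
    dphi = (phi1 - phi2) % (2.0 * math.pi)   # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi                # now in (-pi, pi]
    return dphi

# Three jets; the 50 GeV jet is ignored because it is not among the two hardest.
example = delta_phi_two_hardest([(50.0, 0.1), (120.0, 3.0), (80.0, -3.0)])
```

Wrapping matters here because two jets at $\phi = 3.0$ and $\phi = -3.0$ are close in azimuth, even though the raw difference is 6.0.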

As tree level merging algorithms, the CKKW-L merging algorithm [18, 19, 20] and a new tree level
merging algorithm are implemented. Our new algorithm differs from the CKKW-L algorithm in the strategy
for phase space separation. It is designed so that the contribution from the $t\bar{t}$ + 0, 1 parton matrix elements
to the event samples with two or more jets, which we call the contamination, is more strongly suppressed above
the merging scale. Therefore, more accurate predictions of correlations between two jets are expected.
We confirm this by numerically comparing the two algorithms. The contamination is studied by varying the
relation between the merging scale and the scale of the jet definition. We find that the contamination is not
negligible when the merging scale is set equal to or slightly smaller than the scale of the anti-$k_T$ jet definition.

We produce the $\Delta\phi$ distribution by using the generated event samples. The distribution shows a strong
correlation in $\Delta\phi$, as predicted in the previous analysis [13] based on the $t\bar{t}$ + 2 partons tree level matrix
elements. This observation confirms that the correlation found in the $t\bar{t}$ + 2 partons tree level matrix
elements is still visible after including the dominant QCD higher order corrections and thus can be
observed in experiments.

We observe a clear difference in the $\Delta\phi$ distribution between the event samples generated by merging
the matrix elements for $t\bar{t}$ plus up to 2 partons and those generated by merging the matrix elements for
$t\bar{t}$ plus up to 3 partons. Furthermore, the difference is found to be slightly larger when the rapidity range
for jets is more restricted. We study the effects of the $t\bar{t}$ + 3 partons matrix elements on the correlation
in detail and identify the origins of the difference. We show that the $t\bar{t}$ + 3 partons matrix elements play
an important role in predicting $\Delta\phi$ accurately.

We present a method for merging matrix element event samples which include the $t\bar{t}$ decay as a part
of the hard process with the parton shower. In this method, correlations between the decay products of the
$t\bar{t}$ are predicted correctly, while the merging algorithm is performed as if the $t\bar{t}$ had not decayed (see footnote 1). The effect
of the $t\bar{t}$ dilepton decay on the azimuthal angle correlation is studied. We find that the effect is small when
the two hardest jets are selected from all jets excluding the two hardest $b$ jets.

We note in passing that this is not the first attempt to estimate higher order corrections to the azimuthal
angle difference between the two partons produced in association with a top quark pair. The azimuthal
angle difference between two jets in the $t\bar{t}$ production process has been studied with the aim of a scalar top
quark search in ref. [28], of a gluino search in ref. [29], and of investigating top quark mass effects in the
effective Higgs-gluon coupling in refs. [30] and [31].

In Section 2, our implementation of the tree level merging algorithms is described in detail. In Section 3,
the azimuthal angle correlation is studied. In Section 4, we summarize our findings.


2 Implementation of merging algorithms 


Our implementation of tree level merging algorithms is described in this section. In Section 2.1, the basic
idea of tree level merging algorithms is reviewed, and the notation used throughout the paper is
introduced. In Section 2.2, the CKKW-L merging algorithm [18, 19, 20] is reviewed first. After that,
we introduce a new tree level merging algorithm, which differs from the CKKW-L merging algorithm in the
strategy for phase space separation. The procedure for constructing the PYTHIA8 parton shower history is
described in Section 2.3, and that for calculating the weight function is explained in Section 2.4. These are
ingredients of the merging algorithms. A method for consistently merging matrix element event samples
which include decays of the top and antitop quarks as a part of the hard process is presented in Section 2.5.
In Section 2.6, the procedure for the event generation of top quark pair production and its setup, such
as scale choices, are explained. Our implementation is carefully tested in Section 2.7.


1 This is necessary, since QCD radiation off a top quark takes place faster than its decay. 

2.1 Tree level merging algorithms

In this section, the basic idea of tree level merging algorithms is reviewed. The notation used
throughout the paper is also presented.


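Let us start with the DGLAP evolution equation [32, 33, 34] with the Sudakov form factor [35],

$$
t\,\frac{d}{dt}\left(\frac{q(x,t)}{\Delta(t)}\right)
 = \int_0^{\epsilon(t)} \frac{dz}{z}\,\frac{\alpha_s}{2\pi}\,P_{qq}(z)\,\frac{q(x/z,t)}{\Delta(t)},
\tag{2.1}
$$

where the Sudakov form factor is given by

$$
\Delta(t) = \exp\left[-\int^{t}\frac{dt'}{t'}\int_0^{\epsilon(t')} dz\,\frac{\alpha_s}{2\pi}\,P_{qq}(z)\right].
\tag{2.2}
$$
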
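Here only the quark parton distribution function (PDF) $q(x,t)$ and the splitting function without the virtual
correction, $P_{qq}(z)$ for $q \to qg$, are introduced in order to simplify the notation. The generalization is, however,
simple. Integrating eq. (2.1) over $t_\Lambda < t < t_X$ gives

$$
\frac{q(x,t_X)}{\Delta(t_X)} = \frac{q(x,t_\Lambda)}{\Delta(t_\Lambda)}
 + \int_{t_\Lambda}^{t_X}\frac{dt}{t}\int_0^{\epsilon(t)} d\rho_{qq}(z)\,\frac{q(x/z,t)}{\Delta(t)},
\tag{2.3}
$$

where a shorthand notation is used,

$$
d\rho_{qq}(z) = \frac{dz}{z}\,\frac{\alpha_s}{2\pi}\,P_{qq}(z).
\tag{2.4}
$$

After infinite iterations of eq. (2.3), it is found that

$$
\begin{aligned}
\frac{q(x,t_X)}{\Delta(t_X)} = {} & \frac{q(x,t_\Lambda)}{\Delta(t_\Lambda)}
 + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)\,\frac{q(x/z_1,t_\Lambda)}{\Delta(t_\Lambda)} \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)
   \int_{t_\Lambda}^{t_1}\frac{dt_2}{t_2}\int_0^{\epsilon(t_2)} d\rho_{qq}(z_2)\,\frac{q(x/(z_1 z_2),t_\Lambda)}{\Delta(t_\Lambda)}
 + \cdots .
\end{aligned}
\tag{2.5}
$$

By dividing this equation by $q(x,t_X)/\Delta(t_X)$, we find

$$
\begin{aligned}
1 = {} & \frac{q(x,t_\Lambda)}{q(x,t_X)}\,\frac{\Delta(t_X)}{\Delta(t_\Lambda)}
 + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)\,
   \frac{q(x/z_1,t_\Lambda)}{q(x,t_X)}\,\frac{\Delta(t_X)}{\Delta(t_\Lambda)} \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)
   \int_{t_\Lambda}^{t_1}\frac{dt_2}{t_2}\int_0^{\epsilon(t_2)} d\rho_{qq}(z_2)\,
   \frac{q(x/(z_1 z_2),t_\Lambda)}{q(x,t_X)}\,\frac{\Delta(t_X)}{\Delta(t_\Lambda)}
 + \cdots .
\end{aligned}
\tag{2.6}
$$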


The first term on the right hand side (RHS) of this equation can be regarded as the probability of generating
no radiation from a quark $q(x)$ in a proton during the scale evolution of the proton between $t_X$ and $t_\Lambda$
($t_X > t_\Lambda$). The second term represents the integrated probability of generating exactly one radiation
from the quark during the scale evolution, and so on. The left hand side (LHS) ensures probability
conservation.


We write the no radiation probability from a quark $q(x)$ in a proton during the scale evolution of the
proton between $t_1$ and $t_2$ ($t_1 > t_2$) in the following form [35]:

$$
\Pi_q(t_1,t_2;x) = \frac{q(x,t_2)}{q(x,t_1)}\,\frac{\Delta(t_1)}{\Delta(t_2)}.
\tag{2.7}
$$
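
Then, we can express eq. (2.6) as

$$
\begin{aligned}
1 = {} & \Pi_q(t_X,t_\Lambda;x) \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)\,
   \Pi_q(t_X,t_1;x)\,\frac{q(x/z_1,t_1)}{q(x,t_1)}\,\Pi_q(t_1,t_\Lambda;x/z_1) \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^{\epsilon(t_1)} d\rho_{qq}(z_1)
   \int_{t_\Lambda}^{t_1}\frac{dt_2}{t_2}\int_0^{\epsilon(t_2)} d\rho_{qq}(z_2)\,
   \Pi_q(t_X,t_1;x)\,\frac{q(x/z_1,t_1)}{q(x,t_1)} \\
& \qquad\times \Pi_q(t_1,t_2;x/z_1)\,\frac{q(x/(z_1 z_2),t_2)}{q(x/z_1,t_2)}\,\Pi_q(t_2,t_\Lambda;x/(z_1 z_2)) \\
& + \cdots .
\end{aligned}
\tag{2.8}
$$

It is an easy task to guess the explicit form of $\Pi_q(t_1,t_2;x)$ from the above equation:

$$
\Pi_q(t_1,t_2;x) = \exp\left(-\int_{t_2}^{t_1}\frac{dt}{t}\int_0^{\epsilon(t)} d\rho_{qq}(z)\,\frac{q(x/z,t)}{q(x,t)}\right).
\tag{2.9}
$$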
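
The exponential form in eq. (2.9) can also be checked directly instead of guessed. The short derivation below is a sketch that assumes eq. (2.1) reads $t\,\frac{d}{dt}\bigl(q/\Delta\bigr) = \int_0^{\epsilon(t)} d\rho_{qq}(z)\, q(x/z,t)/\Delta(t)$, with $d\rho_{qq}$ the shorthand of eq. (2.4), and uses the definition in eq. (2.7); no other input is needed.

```latex
\begin{align*}
t\,\frac{d}{dt}\ln\frac{q(x,t)}{\Delta(t)}
  &= \frac{\Delta(t)}{q(x,t)}\; t\,\frac{d}{dt}\!\left(\frac{q(x,t)}{\Delta(t)}\right)
   = \int_0^{\epsilon(t)} d\rho_{qq}(z)\,\frac{q(x/z,t)}{q(x,t)} , \\
\ln \Pi_q(t_1,t_2;x)
  &= \ln\frac{q(x,t_2)}{\Delta(t_2)} - \ln\frac{q(x,t_1)}{\Delta(t_1)}
   = -\int_{t_2}^{t_1}\frac{dt}{t}\; t\,\frac{d}{dt}\ln\frac{q(x,t)}{\Delta(t)} \\
  &= -\int_{t_2}^{t_1}\frac{dt}{t}\int_0^{\epsilon(t)} d\rho_{qq}(z)\,\frac{q(x/z,t)}{q(x,t)} .
\end{align*}
```

Exponentiating the last line reproduces eq. (2.9). The ratio of PDFs in the exponent is what distinguishes this no radiation probability for initial state radiation from the plain Sudakov form factor of eq. (2.2).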


Given the scale $t_X$ and the energy fraction $x$ of a quark in a proton, eq. (2.8) allows us to generate
radiation from the incoming quark by evolving the proton downwards from the scale $t_X$. This is known as backward
evolution [36, 37, 35].

Eq. (2.8), derived from the DGLAP equation, concerns only radiation from an
incoming quark in a proton, i.e. initial state radiation. Here we generalize the equation to one which
can predict radiation from outgoing partons, i.e. final state radiation, as well as initial state radiation.
What we should notice for this purpose is that the PDFs play a role in constraining the scale evolution of
the proton, or in other words in constraining radiation from the quark in the proton during the scale evolution
of the proton. Hence, when radiation from outgoing partons is considered, we replace the PDFs
with a function which is obtained from the kinematic information of the outgoing partons. This function
constrains radiation from the outgoing partons, through energy and momentum conservation for instance.

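We let $\{p\}_{X+n}$ denote a complete specification of an event sample consisting of $X + n$ partons (see footnote 2). The
information on the two incoming partons is also included. Then, we introduce a function for the evolution of
a $\{p\}_{X+n}$ as

$$
f(z,t;\{p\}_{X+n}),
\tag{2.10}
$$

which constrains the evolution of the $\{p\}_{X+n}$ at the evolution scale $t$ and the energy fraction $z$. With the
constraint function, eq. (2.8) can be generalized to

$$
\begin{aligned}
1 = {} & \Pi(t_X,t_\Lambda;\{p\}_X) \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^1 d\rho(z_1)\,
   \Pi(t_X,t_1;\{p\}_X)\,f(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_\Lambda;\{p\}_{X+1}) \\
& + \int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^1 d\rho(z_1)\,
   \Pi(t_X,t_1;\{p\}_X)\,f(z_1,t_1;\{p\}_X) \\
& \qquad\times \int_{t_\Lambda}^{t_1}\frac{dt_2}{t_2}\int_0^1 d\rho(z_2)\,
   \Pi(t_1,t_2;\{p\}_{X+1})\,f(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,t_\Lambda;\{p\}_{X+2}) \\
& + \cdots ,
\end{aligned}
\tag{2.11}
$$

and accordingly

$$
\Pi(t_1,t_2;\{p\}_{X+n}) = \exp\left(-\int_{t_2}^{t_1}\frac{dt}{t}\int_0^1 d\rho(z)\,f(z,t;\{p\}_{X+n})\right),
\tag{2.12}
$$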


2 This expression is inspired by refs. [21, 38].
which is defined as the no radiation probability for a $\{p\}_{X+n}$ as a whole, during its scale evolution
between $t_1$ and $t_2$ ($t_1 > t_2$). In other words, this is the probability that a $\{p\}_{X+n}$ remains the same during
the scale evolution. Note that

$$
d\rho(z) =
\begin{cases}
\dfrac{dz}{z}\,\dfrac{\alpha_s}{2\pi}\,P(z) & \text{for initial state radiation},\\[1ex]
dz\,\dfrac{\alpha_s}{2\pi}\,P(z) & \text{for final state radiation},
\end{cases}
\tag{2.13}
$$

where the appropriate splitting function(s) should be used for $P(z)$ according to the branching process
$\{p\}_{X+n} \to \{p\}_{X+n+1}$. For initial state radiation, the constraint function $f(z,t;\{p\}_{X+n})$ always includes
the PDFs. The soft gluon singularity at $z = 1$ in the splitting functions is avoided by introducing $\theta$
functions in the constraint function.
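
The probability conservation expressed by eq. (2.11) can be illustrated numerically in a deliberately simplified toy model. The sketch below assumes that the $z$ integral of $d\rho(z)\,f$ equals the same constant `c` for every state, which is not true in a real shower; all names and numbers are hypothetical. Under this assumption eq. (2.12) becomes a power law, the chain of no radiation factors in each term multiplies to $\Pi(t_X,t_\Lambda)$, and the ordered scale integrals give $(cL)^n/n!$ with $L = \ln(t_X/t_\Lambda)$.

```python
import math

def no_emission_prob(t_hi, t_lo, c):
    """Toy version of eq. (2.12): exp(-c * ln(t_hi / t_lo)) = (t_lo / t_hi)**c."""
    return (t_lo / t_hi) ** c

def n_emission_prob(n, t_x, t_lambda, c):
    """Probability of exactly n ordered emissions between t_x and t_lambda.

    With a state-independent rate c, the product of no radiation factors in
    the n-th term of eq. (2.11) collapses to Pi(t_x, t_lambda), and the n
    nested, ordered dt/t integrals contribute (c * L)**n / n!.
    """
    log_ratio = math.log(t_x / t_lambda)
    return (no_emission_prob(t_x, t_lambda, c)
            * (c * log_ratio) ** n / math.factorial(n))

# Sum the first 60 terms of the toy eq. (2.11); the result should be unity.
total = sum(n_emission_prob(n, t_x=100.0**2, t_lambda=1.0, c=0.3)
            for n in range(60))
```

In this toy model the terms form a Poisson distribution in the number of emissions, which makes the role of the first term as a no radiation probability explicit.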


Now that we have the integrated form of the DGLAP equation with the Sudakov form factor in our
notation as eqs. (2.11) and (2.12), we discuss the basic idea of tree level merging algorithms. Let us first
denote the cross section of a hard process producing $X$ by $\sigma(X)$. For example, when $X = q\bar{q}$ in $e^+e^-$
annihilation is considered, $\sigma(X)$ is given by

$$
\sigma(e^+e^- \to q\bar{q}) = \frac{1}{2s}\int d\Phi_{q\bar{q}}\,\sum\left|\mathcal{M}_{e^+e^-\to q\bar{q}}\right|^2,
\tag{2.14}
$$

or, when $X = Z$ in proton-proton collisions is considered, $\sigma(X)$ is given by

$$
\sigma(pp \to Z) = \int dx_1 \int dx_2\; q(x_1,\mu_F)\,\bar{q}(x_2,\mu_F)\,\frac{1}{2 s x_1 x_2}\,
 d\Phi_Z\,\sum\left|\mathcal{M}_{q\bar{q}\to Z}\right|^2.
\tag{2.15}
$$

The DGLAP evolution of the hard process can be expressed by multiplying $\sigma(X)$ by eq. (2.11), namely

$$
\begin{aligned}
\sigma(X) = {} & \sigma(X)\,\Pi(t_X,t_\Lambda;\{p\}_X) \\
& + \sigma(X)\int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^1 d\rho(z_1)\,
   \Pi(t_X,t_1;\{p\}_X)\,f(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_\Lambda;\{p\}_{X+1}) \\
& + \sigma(X)\int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^1 d\rho(z_1)\,
   \Pi(t_X,t_1;\{p\}_X)\,f(z_1,t_1;\{p\}_X) \\
& \qquad\times \int_{t_\Lambda}^{t_1}\frac{dt_2}{t_2}\int_0^1 d\rho(z_2)\,
   \Pi(t_1,t_2;\{p\}_{X+1})\,f(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,t_\Lambda;\{p\}_{X+2}) \\
& + \cdots .
\end{aligned}
\tag{2.16}
$$


The core idea of merging algorithms is to replace the terms constructed from the leading order cross section
times the universal radiation probability with the exact tree level cross sections. For the second term on the
RHS of the above equation, for instance, it proceeds as

$$
\sigma(X)\int_{t_\Lambda}^{t_X}\frac{dt_1}{t_1}\int_0^1 d\rho(z_1) \;\longrightarrow\; \sigma(X+1).
\tag{2.17}
$$

Eq. (2.16) then takes the following form:

$$
\begin{aligned}
\sigma(X) = {} & \sigma(X)\,\Pi(t_X,t_{\rm cut};\{p\}_X) \\
& + \sigma(X+1;\{p\}_{X+1} > t_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_{\rm cut};\{p\}_{X+1}) \\
& + \sigma(X+2;\{p\}_{X+2} > t_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1})\,f'(z_2,t_2;\{p\}_{X+1}) \\
& \qquad\times \Pi(t_2,t_{\rm cut};\{p\}_{X+2}) \\
& + \cdots ,
\end{aligned}
\tag{2.18}
$$


where $t_\Lambda$ is replaced with $t_{\rm cut}$. The soft and collinear divergences in the tree level matrix elements are regularized
in the following way: using the definition of the evolution variable $t$, calculate the minimum value $t_{\rm min}$
from $\{p\}_{X+n}$ and require $t_{\rm min} > t_{\rm cut}$. This is expressed as $\{p\}_{X+n} > t_{\rm cut}$ in the above equation (see footnote 3). The cut-off
scale $t_{\rm cut}$ is called the merging scale. Notice that the constraint functions in eq. (2.18) are different from
those in eq. (2.16), since some part of the constraints is already included in the exact tree level cross sections.

3 Here it is assumed that the hard process cross section $\sigma(X)$ is finite everywhere in its phase space.

Eq. (2.18) summarizes the basic idea of tree level merging algorithms in our notation. Tree level
merging algorithms are designed to improve the DGLAP equation with the help of exact tree level cross
sections, or in other words exact tree level matrix elements.


2.2 The CKKW-L algorithm and its extension 

In this section, the event generation procedure following the improved DGLAP evolution equation
in eq. (2.18) with the CKKW-L merging algorithm [18, 19, 20] is reviewed first. Then, we present a new
merging algorithm, which differs from the CKKW-L algorithm in the strategy for phase space separation.
Our strategy is introduced with the aim of predicting jet angular correlations more accurately. The
independence of the parton shower starting scale is discussed at the end.


We examine each term on the right hand side (RHS) of eq. (2.18) one by one in the following. In
order to generate an event sample according to the probability of the first term,

$$
\sigma(X)\,\Pi(t_X,t_{\rm cut};\{p\}_X),
\tag{2.19}
$$

an event sample $\{p\}_X$ is first generated with the cross section $\sigma(X)$. The next step is to calculate
the Sudakov form factor $\Pi(t_X,t_{\rm cut};\{p\}_X)$ (see footnote 4). The CKKW-L algorithm uses a parton shower generator to
calculate Sudakov form factors. We execute a shower generator on the $\{p\}_X$, setting the shower starting
scale to $t_X$. If the first evolution scale randomly chosen is higher than $t_{\rm cut}$, then we throw away the
event sample as a whole and go to the next new event sample. This procedure is equivalent to accepting the
event sample with the probability equal to $\Pi(t_X,t_{\rm cut};\{p\}_X)$, since it is the no radiation probability during
the scale evolution of the $\{p\}_X$ between the scales $t_X$ and $t_{\rm cut}$. If the first evolution scale is lower than
$t_{\rm cut}$, this evolution is continued until the cut-off scale $t_{\rm had}$ of the shower model and the event sample
contributes to the inclusive event samples.
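
The equivalence between the veto and a Sudakov weight can be checked with a small Monte Carlo experiment. The sketch below is not PYTHIA8; it assumes a toy shower in which the no radiation probability is $(t/t_{\rm start})^c$ with a hypothetical constant `c`, so the first emission scale can be sampled by inverting that expression. All function names and parameter values are illustrative.

```python
import random

def first_emission_scale(t_start, c, rng):
    """Sample the first emission scale below t_start in the toy shower.

    Solves Pi(t_start, t) = R for a uniform random number R, where
    Pi(t_start, t) = (t / t_start)**c is the toy no radiation probability.
    """
    r = 1.0 - rng.random()            # uniform in (0, 1], avoids r == 0
    return t_start * r ** (1.0 / c)

def accept_zero_parton_event(t_x, t_cut, c, rng):
    """Keep the event only if the trial shower emits nothing above t_cut."""
    return first_emission_scale(t_x, c, rng) <= t_cut

rng = random.Random(12345)            # fixed seed for reproducibility
t_x, t_cut, c = 100.0**2, 20.0**2, 0.3
n_trials = 200_000
accepted = sum(accept_zero_parton_event(t_x, t_cut, c, rng)
               for _ in range(n_trials))
fraction = accepted / n_trials        # Monte Carlo estimate of the weight
expected = (t_cut / t_x) ** c         # Pi(t_x, t_cut) in the toy model
```

Within statistical fluctuations `fraction` agrees with `expected`, which is the statement that discarding events on the first trial emission above $t_{\rm cut}$ reproduces the factor $\Pi(t_X,t_{\rm cut};\{p\}_X)$ without evaluating it analytically.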


Next, let us look at the second term,

$$
\sigma(X+1;\{p\}_{X+1} > t_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_{\rm cut};\{p\}_{X+1}).
\tag{2.20}
$$

First, an event sample $\{p\}_{X+1}$ is generated with the tree level cross section $\sigma(X+1;\{p\}_{X+1} > t_{\rm cut})$. In
order to calculate the first Sudakov form factor $\Pi(t_X,t_1;\{p\}_X)$, we need the intermediate event $\{p\}_X$ and
the scale $t_1$. In the CKKW-L algorithm, these are obtained from the $\{p\}_{X+1}$ by executing a program which
does the exact inverse of the shower generation of the parton shower generator which is used. The program
should produce a $\{p\}_X$ and a scale $t_1$ from the $\{p\}_{X+1}$ as if the parton shower generator had evolved
the $\{p\}_X$ and then had generated the $\{p\}_{X+1}$ at the evolution scale $t_1$. This backward flow is often
called a parton shower history. The construction of the parton shower history depends completely on the
shower generator which is used. In Section 2.3, the construction of the PYTHIA8 parton shower history is
described. For now let us assume that we successfully construct a shower history of the $\{p\}_{X+1}$ and obtain
a $\{p\}_X$ and a scale $t_1$. In order to calculate the first Sudakov form factor, the shower generator is executed
on the $\{p\}_X$, starting from the scale $t_X$ as before. If the first evolution scale randomly chosen is higher than
$t_1$, we throw away the event sample $\{p\}_{X+1}$ as a whole and go to the next new event sample. If not, the
constraint function $f'(z_1,t_1;\{p\}_X)$ is calculated and the event sample $\{p\}_{X+1}$ is re-weighted according to
it. The calculation of the constraint function is discussed in Section 2.4. If the event sample still survives
after the re-weighting, the second Sudakov form factor $\Pi(t_1,t_{\rm cut};\{p\}_{X+1})$ is calculated by executing the
shower generator on the $\{p\}_{X+1}$ from the scale $t_1$ and throwing away the event sample as a whole if the
first evolution scale is higher than $t_{\rm cut}$. If the first evolution scale is lower than $t_{\rm cut}$, this
evolution is continued until the cut-off scale $t_{\rm had}$ and the event sample contributes to the inclusive event samples.
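
The sequence of trial showers described above generalizes to any number of reconstructed scales. The following sketch uses the same kind of toy shower with constant emission rates and is an illustration only: the function name, the `rates` argument (one hypothetical rate per intermediate state) and the omission of the re-weighting by the constraint function $f'$ are all assumptions, and the history scales are taken as given rather than reconstructed.

```python
import random

def passes_sudakov_vetoes(t_x, history_scales, t_cut, rates, rng):
    """Toy acceptance test for an event with history scales t_1 > t_2 > ...

    rates[i] is the toy emission rate of the intermediate state with i extra
    partons. State i is evolved from its own start scale; a trial emission
    above the next scale in the history (or above t_cut for the last state)
    vetoes the whole event. The f' re-weighting is not modelled here.
    """
    if len(rates) != len(history_scales) + 1:
        raise ValueError("need one rate per intermediate state")
    starts = [t_x] + list(history_scales)
    ends = list(history_scales) + [t_cut]
    for c, t_start, t_end in zip(rates, starts, ends):
        r = 1.0 - rng.random()                 # uniform in (0, 1]
        t_trial = t_start * r ** (1.0 / c)     # first trial emission scale
        if t_trial > t_end:
            return False                       # radiation above t_end: veto
    return True
```

For a one-parton event with `t_x = 1e4`, `history_scales = [2500.0]`, `t_cut = 400.0` and `rates = [0.3, 0.5]`, the acceptance rate approaches the product of the two toy no radiation factors, $(2500/10^4)^{0.3}\times(400/2500)^{0.5}$, which mirrors the two Sudakov form factors in eq. (2.20).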


4 Strictly speaking, this factor, defined in eq. (2.12), is not identical to the Sudakov form factor defined in eq. (2.2) in our
notation. However, we call it the Sudakov form factor, too.









Figure 1: The vertical axis represents the evolution variable $t$; the horizontal axis represents some variables in the
shower model which determine the kinematics. The diagonal line indicates the merging scale $Q^2_{\rm cut}$. The left and right
panels represent the same event $\{p\}_{X+2}$; however, their origins can be different. In the CKKW-L algorithm, the
contribution in the left panel originates from $\sigma(X)$ and the one in the right panel originates from $\sigma(X+2)$. In our
merging algorithm, both originate from $\sigma(X+2)$. See the text in Section 2.2 for a detailed explanation.


The event generation following the third and higher terms on the RHS of eq. (2.18) can be performed in
the same way. Although merging algorithms can treat tree level cross sections with any number of partons,
there are practical limitations in calculating them. If we decide not to calculate $\sigma(X+3;\{p\}_{X+3} > t_{\rm cut})$, for instance,
what we should do is to remove the last Sudakov form factor $\Pi(t_2,t_{\rm cut};\{p\}_{X+2})$ from the third term on the
RHS of eq. (2.18) [18]. The third term, which used to be the probability of generating a $\{p\}_{X+2}$ exclusively
above $t_{\rm cut}$, is now the probability of generating a $\{p\}_{X+2}$ exclusively above $t_2$ ($> t_{\rm cut}$). The evolution
equation is closed at the third term, i.e. the dots in eq. (2.18) disappear.


Up to now it has been assumed that the definition of the merging scale $t_{\rm cut}$ is equivalent to that of
the shower evolution variable $t$. It is discussed in refs. [18, 20] that the definition of the merging scale
can be arbitrary, as long as it regulates the singularities in the tree level matrix elements. We let $Q^2_{\rm cut}$ denote
a definition of the merging scale. This can be the $k_\perp$ definition of the jet clustering algorithm [39], for
instance. With an arbitrary definition of the merging scale, the Sudakov form factor in the first term on
the RHS of eq. (2.18) is calculated as follows. We execute a shower generator on the $\{p\}_X$, starting from
$t_X$, as before. After the first evolution, we obtain a $\{p\}_{X+1}$ and calculate the minimum value $Q^2_{\rm min}$ on
the $\{p\}_{X+1}$ by using the definition of the merging scale $Q^2_{\rm cut}$. If $Q^2_{\rm min} > Q^2_{\rm cut}$, we throw away the event
sample as a whole and go to the next new event sample. We express this as $\{p\}_{X+1} > Q_{\rm cut}$. If not, i.e.
$Q_{\rm min} < Q_{\rm cut}$, this evolution is continued until the cut-off scale $t_{\rm had}$ and the event sample contributes to the
inclusive event samples. We express this as $\{p\}_{X+1} < Q_{\rm cut}$. This procedure is applied to the calculation of
the last Sudakov form factor of each term on the RHS of eq. (2.18).


However, it can easily be imagined that, even though $\{p\}_{X+1} < Q_{\rm cut}$ is satisfied after the first evolution of
the $\{p\}_X$, the second evolution may generate a $\{p\}_{X+2}$ with $\{p\}_{X+2} > Q_{\rm cut}$. This is double
counting with the contribution from the third term on the RHS of eq. (2.18). The solution to the double
counting issue in the CKKW-L algorithm can be explained as follows, using the illustrations in Figure 1 (see footnote 5).
The vertical axis represents the evolution variable $t$; the horizontal axis represents some kinematic variables
in a shower model, such as the energy fraction variable $z$. One point in the figure corresponds to one phase
space point of a $\{p\}_{X+n}$, and it is uniquely determined once a value for the vertical axis and a value for
the horizontal axis are chosen. The diagonal line indicates the merging scale $Q^2_{\rm cut}$. Note that this line will
be perpendicular to the vertical axis when the definition of the merging scale is equivalent to that of the
evolution variable, i.e. $Q^2_{\rm cut} = t_{\rm cut}$. The left panel shows a contribution of the first term on the RHS of
eq. (2.18) after the first and second evolution. This event sample satisfies $\{p\}_{X+1} < Q_{\rm cut}$ after the first
evolution, thus it is already accepted. However, after the second evolution, it becomes $\{p\}_{X+2} > Q_{\rm cut}$. The
right panel shows a contribution of the third term on the RHS of eq. (2.18) after no evolution. As
illustrated in the right panel, the CKKW-L algorithm requires intermediate events to satisfy the merging
scale cut, i.e. $\{p\}_{X+1} > Q_{\rm cut}$ as well as $\{p\}_{X+2} > Q_{\rm cut}$. The sum of the two contributions in the left and
right panels does not lead to double counting, since the $\{p\}_{X+1}$ in the left panel and the $\{p\}_{X+1}$ in the
right panel live in different phase space regions. For the CKKW-L algorithm, eq. (2.18) can be written
as follows:

$$
\begin{aligned}
\sigma(X)_{\text{CKKW-L}} = {} & \sigma(X)\,\Pi(t_X,\{p\}_{X+1} < Q_{\rm cut};\{p\}_X) \\
& + \sigma(X+1;\{p\}_{X+1} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,\{p\}_{X+2} < Q_{\rm cut};\{p\}_{X+1}) \\
& + \sigma(X+2;\{p\}_{X+2} > Q_{\rm cut},\{p\}_{X+1} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1}) \\
& \qquad\times f'(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,\{p\}_{X+3} < Q_{\rm cut};\{p\}_{X+2}) \\
& + \cdots .
\end{aligned}
\tag{2.21}
$$

5 This illustration is inspired by refs. [18, 20].


In this study, we use a different method to avoid the double counting issue. When the purpose of merging is
to predict angular correlations between two hard jets, it can be a disadvantage of the CKKW-L algorithm
that the contribution shown in the left panel of Figure 1 is described by $\sigma(X)$. The $\{p\}_{X+2}$ in the left panel
has the potential to produce an event with $X$ + 2 hard jets. However, the angular correlations between the two jets are
not correct, since this event originates from $\sigma(X)$, not $\sigma(X+2)$. Therefore, our algorithm is designed so
that the contribution shown in the left panel originates from $\sigma(X+2)$, not $\sigma(X)$. The improved evolution
equation in eq. (2.18) for our merging algorithm can be written as follows:

$$
\begin{aligned}
\sigma(X)_{\text{CKKW-L+}} = {} & \sigma(X)\,\Pi(t_X,\{p\}_{X+1} < Q_{\rm cut},\{p\}_{X+2} < Q_{\rm cut},\cdots;\{p\}_X) \\
& + \sigma(X+1;\{p\}_{X+1} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X) \\
& \qquad\times \Pi(t_1,\{p\}_{X+2} < Q_{\rm cut},\{p\}_{X+3} < Q_{\rm cut},\cdots;\{p\}_{X+1}) \\
& + \sigma(X+2;\{p\}_{X+2} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1}) \\
& \qquad\times f'(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,\{p\}_{X+3} < Q_{\rm cut},\{p\}_{X+4} < Q_{\rm cut},\cdots;\{p\}_{X+2}) \\
& + \cdots .
\end{aligned}
\tag{2.22}
$$


Notice that the cut $\{p\}_{X+1} > Q_{\rm cut}$ in the third term on the RHS of eq. (2.21) now disappears and a cut
$\{p\}_{X+2} < Q_{\rm cut}$ is newly added in the first term. The merging scale cuts expressed by the dots in the last Sudakov
form factor of each term depend on the maximal number of partons predicted by the tree level cross sections.
In order to make the expression clearer, let us consider the case in which we do not calculate $\sigma(X+3)$, i.e. $N = 2$. The
evolution equation is given by

$$
\begin{aligned}
\sigma(X)_{\text{CKKW-L+}} = {} & \sigma(X)\,\Pi(t_X,\{p\}_{X+1} < Q_{\rm cut},\{p\}_{X+2} < Q_{\rm cut};\{p\}_X) \\
& + \sigma(X+1;\{p\}_{X+1} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,\{p\}_{X+2} < Q_{\rm cut};\{p\}_{X+1}) \\
& + \sigma(X+2;\{p\}_{X+2} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1})\,f'(z_2,t_2;\{p\}_{X+1}),
\end{aligned}
\tag{2.23}
$$


where $N$ denotes the maximal number of partons predicted by the tree level cross sections. This equation
implies that both of the contributions in Figure 1 originate from $\sigma(X+2)$. Since they never originate from
$\sigma(X)$ or $\sigma(X+1)$, there is no double counting. For the sake of completeness, we give the evolution equation
for $N = 3$:

$$
\begin{aligned}
\sigma(X)_{\text{CKKW-L+}} = {} & \sigma(X)\,\Pi(t_X,\{p\}_{X+1} < Q_{\rm cut},\{p\}_{X+2} < Q_{\rm cut},\{p\}_{X+3} < Q_{\rm cut};\{p\}_X) \\
& + \sigma(X+1;\{p\}_{X+1} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X) \\
& \qquad\times \Pi(t_1,\{p\}_{X+2} < Q_{\rm cut},\{p\}_{X+3} < Q_{\rm cut};\{p\}_{X+1}) \\
& + \sigma(X+2;\{p\}_{X+2} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1}) \\
& \qquad\times f'(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,\{p\}_{X+3} < Q_{\rm cut};\{p\}_{X+2}) \\
& + \sigma(X+3;\{p\}_{X+3} > Q_{\rm cut})\,\Pi(t_X,t_1;\{p\}_X)\,f'(z_1,t_1;\{p\}_X)\,\Pi(t_1,t_2;\{p\}_{X+1}) \\
& \qquad\times f'(z_2,t_2;\{p\}_{X+1})\,\Pi(t_2,t_3;\{p\}_{X+2})\,f'(z_3,t_3;\{p\}_{X+2}).
\end{aligned}
\tag{2.24}
$$

Notice that a cut $\{p\}_{X+3} < Q_{\rm cut}$ is newly added in the last Sudakov form factor in the first and second
terms on the RHS.

In this paper we call the algorithm given in eq. (2.22) the CKKW-L+ merging algorithm and use
it for the event generation. In Section 3.2, the CKKW-L+ algorithm will be numerically compared with the
CKKW-L algorithm.
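
The difference between the two algorithms in the last Sudakov form factor of each term can be summarized as two veto rules. The sketch below is a schematic reading of eqs. (2.21) to (2.24), not an excerpt from any shower code: the function names and the input format are hypothetical. The list `q_min_after_emission` holds the merging scale measure $Q_{\rm min}$ evaluated on the state after the first, second, ... shower emission; `n_me` is the number of extra partons in the matrix element sample and `n_max` is $N$.

```python
def vetoed_ckkwl(q_min_after_emission, q_cut, n_me, n_max):
    """CKKW-L reading of eq. (2.21): only the first shower emission is tested."""
    if n_me == n_max:
        return False                      # highest multiplicity: no last Sudakov
    if not q_min_after_emission:
        return False                      # the shower emitted nothing
    return q_min_after_emission[0] > q_cut

def vetoed_ckkwl_plus(q_min_after_emission, q_cut, n_me, n_max):
    """CKKW-L+ reading of eqs. (2.22)-(2.24): test emissions up to n_max partons."""
    n_tested = n_max - n_me               # states X+n_me+1, ..., X+n_max
    return any(q > q_cut for q in q_min_after_emission[:n_tested])

# Left panel of Figure 1: a zero-parton event whose first emission is below
# the merging scale, but whose state after the second emission is above it.
left_panel = [15.0, 40.0]
kept_by_ckkwl = not vetoed_ckkwl(left_panel, 30.0, n_me=0, n_max=2)
kept_by_ckkwl_plus = not vetoed_ckkwl_plus(left_panel, 30.0, n_me=0, n_max=2)
```

With these illustrative numbers the event is kept by the CKKW-L rule and rejected by the CKKW-L+ rule, so in CKKW-L+ this phase space point can only be populated by the $\sigma(X+2)$ sample. The additional matrix element cut $\{p\}_{X+1} > Q_{\rm cut}$ that CKKW-L applies to the reconstructed history is not modelled in this sketch.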


Finally, let us discuss the scale $t_X$. The DGLAP evolution of a hard process $\sigma(X)$ in eq. (2.16) indicates
that the scale $t_X$ is the factorization scale of the hard process. Thus in the merging algorithms $t_X$
must be determined from the $\{p\}_X$ which is constructed as a parton shower history from a $\{p\}_{X+n}$. Since
we obtain the scales $t_1, t_2, \cdots$ as shower histories from the event samples generated with tree level matrix
elements, it can happen that $t_X < t_1, t_2, \cdots$. This should be considered as an important prediction of the tree
level matrix elements, since this never happens in the DGLAP shower evolution. Even in such a case, the
maximal shower evolution scale for the event sample must be set to $t_X$. Let us suppose the case that
$t_1 > t_X > t_2 > \cdots$. The first Sudakov form factor $\Pi(t_X,t_1;\{p\}_X)$ is obviously unity, i.e. there is no veto. Care is
needed in the calculation of the second Sudakov form factor. The shower starting scale for the calculation
of the second Sudakov form factor $\Pi(t_1,t_2;\{p\}_{X+1})$ or $\Pi(t_1,\{p\}_{X+2} < Q_{\rm cut},\cdots;\{p\}_{X+1})$ must be set to
$t_X$, not $t_1$. The maximal shower evolution scale $t_X$ is often called the parton shower starting scale.


In the shower evolution without merging algorithms as in eq. (2.16), predictions such as a jet $p_T$
distribution can depend strongly on the shower starting scale $t_X$, since it determines the hardness of radiation. It
has been confirmed that this dependence is reduced significantly once we use merging algorithms [25]. The
reason for this independence can be understood from the improved DGLAP equation in eq. (2.22). Let us
rewrite eq. (2.24), which is obtained from eq. (2.22) by setting $N = 3$, in the following form:

$$
\sigma_{\rm inc}(X) = \sigma_{\rm exc}(X+0) + \sigma_{\rm exc}(X+1) + \sigma_{\rm exc}(X+2) + \sigma_{\rm exc}(X+3),
\tag{2.25}
$$

or equivalently

$$
1 = \frac{\sigma_{\rm exc}(X+0)}{\sigma_{\rm inc}(X)} + \frac{\sigma_{\rm exc}(X+1)}{\sigma_{\rm inc}(X)}
  + \frac{\sigma_{\rm exc}(X+2)}{\sigma_{\rm inc}(X)} + \frac{\sigma_{\rm exc}(X+3)}{\sigma_{\rm inc}(X)}.
\tag{2.26}
$$

First of all, the hardness of radiation parametrized by the evolution variable $t$ as $t_1, t_2, \cdots$ is determined from
the tree level matrix elements, and thus has nothing to do with $t_X$. Since the first Sudakov form factor of each
term, such as $\Pi(t_X,t_1;\{p\}_X)$ and $\Pi(t_X,\{p\}_{X+1} < Q_{\rm cut},\cdots;\{p\}_X)$, depends on $t_X$, the exclusive cross
sections $\sigma_{\rm exc}(X+i)$ and accordingly the inclusive cross section $\sigma_{\rm inc}(X)$ are affected by $t_X$. However,
because the first Sudakov form factor can be considered as an overall factor, the ratio of the exclusive cross
section to the inclusive cross section, i.e. $\sigma_{\rm exc}(X+i)/\sigma_{\rm inc}(X)$, will be little affected. Stable distributions
can be expected for the above two reasons. The dependence on the shower starting scale is numerically
evaluated in Section 2.7.
Figure 2: Illustrating that an initial state shower model evolves a process $b'r' \to X'$ and then generates a process
$ar \to Xc$ (from right to left), and its inverse, the clustering (from left to right).


2.3 Construction of the PYTHIA8 parton shower history 

In Section 2.2 it is explained that the CKKW-L merging algorithms require the construction of the
parton shower history, and that a construction program must do the exact inverse of the shower generation
of the parton shower generator which we use. In this section, the construction of the PYTHIA8 parton
shower history is described. A parton shower history is constructed by successively clustering two
partons into one parton, and one history consists of a set of intermediate events with the corresponding
clustering scales, which are ordered. For instance, a history of an event sample $\{p\}_{X+n}$ consists of
$\{p\}_{X+(n-1)}, \{p\}_{X+(n-2)}, \cdots, \{p\}_{X+i}, \cdots, \{p\}_{X+1}, \{p\}_X$ with $t_n < t_{n-1} < \cdots < t_{i+1} < \cdots < t_2 < t_1$.
Because the detailed definition of the evolution variable and the kinematics construction are different for initial
state radiation (ISR) and final state radiation (FSR) in PYTHIA8, the clustering procedure for an incoming
and an outgoing parton also differs from that for two outgoing partons. The former procedure
is described in Section 2.3.1 and the latter in Section 2.3.2. In these sections, we use the knowledge and the
notations given in the original publications [40, 41, 42] for the shower model in PYTHIA8. Some technical
details in our implementation are summarized in Section 2.3.3.


2.3.1 Construction of the ISR history 

Let us suppose that an incoming parton $a$ and an outgoing parton $c$ in a process $ar \to Xc$ are clustered
into a new parton $b$, and hence an intermediate process $b'r' \to X'$ together with a clustering scale $p_{\perp{\rm clus}}$ is
produced. This is illustrated in Figure 2 (from left to right). The clustering has to proceed as if the initial
state shower model in PYTHIA8 had evolved the hard process $b'r' \to X'$ and then had generated the process
$ar \to Xc$ with the evolution scale $p_{\perp{\rm evol}} = p_{\perp{\rm clus}}$. The clustering scale $p_{\perp{\rm clus}}$ is derived from

$$p_b = p_a - p_c, \tag{2.27a}$$

$$z = \frac{m_{br}^2}{m_{ar}^2} = \frac{(p_b + p_r)^2}{(p_a + p_r)^2}, \tag{2.27b}$$

$$p_{\perp{\rm clus}}^2 = -(1 - z)\,(p_b)^2. \tag{2.27c}$$

Here the $z$ can be interpreted as the energy fraction $E_b/E_a$ in the center of mass frame of proton proton
collisions.
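As an illustration of eqs. (2.27a) to (2.27c), the following is a minimal sketch that evaluates $z$ and $p_{\perp{\rm clus}}^2$ from three four-momenta. The function names, the tuple layout $(E, p_x, p_y, p_z)$ and the numerical momenta are assumptions made for this example only; they are not taken from the authors' program.

```python
def mdot(p, q):
    """Minkowski product with metric (+, -, -, -); four-vectors are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def isr_clustering_scale(p_a, p_c, p_r):
    """Return (z, pT2_clus) for clustering incoming a with outgoing c, recoiler r."""
    p_b = tuple(a - c for a, c in zip(p_a, p_c))      # eq. (2.27a)
    p_br = tuple(b + r for b, r in zip(p_b, p_r))
    p_ar = tuple(a + r for a, r in zip(p_a, p_r))
    z = mdot(p_br, p_br) / mdot(p_ar, p_ar)           # eq. (2.27b)
    pt2_clus = -(1.0 - z) * mdot(p_b, p_b)            # eq. (2.27c), (p_b)^2 < 0
    return z, pt2_clus


# Made-up massless momenta in GeV, for illustration only.
p_a = (100.0, 0.0, 0.0, 100.0)
p_r = (100.0, 0.0, 0.0, -100.0)
p_c = (20.0, 12.0, 0.0, 16.0)
z, pt2 = isr_clustering_scale(p_a, p_c, p_r)
print(z, pt2)
```

For these momenta $(p_b)^2 = -800$ GeV$^2$, $m_{br}^2 = 32000$ GeV$^2$ and $m_{ar}^2 = 40000$ GeV$^2$, so the sketch gives $z = 0.8$ and $p_{\perp{\rm clus}}^2 \simeq 160$ GeV$^2$.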

The new incoming parton $b$ after the clustering is not moving along the $z$-axis and it is a spacelike particle,
i.e. $(p_b)^2 < 0$. Thus we need to make the $b$ on-shell (massless) and moving along the $z$-axis. The $X$ and $X'$
denote all the other particles in the final state, hence $X = X' = t\bar{t} + g$ for instance. The four-momenta of
the $b'$, $r'$ and $X'$ are derived as follows.

1. Read the azimuthal angle $\phi_c$ of the $c$.

2. Rotate the $c$ and $X$ in azimuth by $-\phi_c$.

3. Calculate the four-momentum of the $b$ as $p_b = p_a - p_c$.

4. Boost the $b$, $r$ and $X$ to the $b + r$ rest frame, and then rotate them in polar angle so that the $b$ and $r$
move along the $z$-axis.

5. Rotate the $X$ in azimuth by $+\phi_c$.

6. Newly obtain the four-momenta of massless incoming partons $b$ and $r$ in the $b + r$ rest frame, i.e.
$p_b = (m_{br}/2, 0, 0, m_{br}/2)$ and $p_r = (m_{br}/2, 0, 0, -m_{br}/2)$.

7. Boost the $b$, $r$ and $X$ along the $z$-axis so that the $r$ has its original momentum.

With the above algorithm, the non-zero transverse momentum of the parton $b$ is translated into the kinematics
of the $X$. As required, the $b'$ is on mass shell (massless) and it moves along the $z$-axis. The kinematics of the
$r$ does not change, i.e. $p_r = p_{r'}$. The above algorithm is carefully tested as follows. We apply the algorithm
to an event $ar \to Xc$ which has been generated by the PYTHIA8 initial state evolution $a \to bc$ of a hard
process event $b'r' \to X'$, and then we confirm that the algorithm correctly reproduces the event $b'r' \to X'$.

Figure 3: Illustrating that a final state shower model evolves the set of partons $a'$ and $r'$ and then generates partons
$b$, $c$ and $r$ (from right to left), and its inverse (from left to right).
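Steps 6 and 7 of the ISR construction can be sketched as follows. This is a minimal illustration under the assumption, as in step 6, that the recoiler $r$ moves along the negative $z$-axis; the function names are hypothetical, and the rotations and the treatment of $X$ in steps 1 to 5 are not included.

```python
import math


def boost_z(p, eta):
    """Active boost of four-vector p = (E, px, py, pz) along +z with rapidity eta."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return (ch * p[0] + sh * p[3], p[1], p[2], sh * p[0] + ch * p[3])


def isr_incoming_after_clustering(m_br, e_r):
    """Return (p_b, p_r, eta) for massless b and r with invariant mass m_br.

    Step 6: back-to-back massless momenta in the b + r rest frame.
    Step 7: boost along z so that r (moving along -z) recovers its energy e_r.
    """
    p_b_rest = (0.5 * m_br, 0.0, 0.0, 0.5 * m_br)
    p_r_rest = (0.5 * m_br, 0.0, 0.0, -0.5 * m_br)
    # Under the boost E_r -> exp(-eta) * m_br / 2, which must equal e_r.
    eta = math.log(m_br / (2.0 * e_r))
    return boost_z(p_b_rest, eta), boost_z(p_r_rest, eta), eta


# Illustrative input: m_br^2 = 32000 GeV^2 and a recoiler of 100 GeV.
p_b, p_r, eta = isr_incoming_after_clustering(math.sqrt(32000.0), 100.0)
print(p_b, p_r, eta)
```

The same rapidity `eta` would then be applied to the particles in $X$. With the illustrative input, the new parton $b'$ has energy $m_{br}^2/(4E_r) = 80$ GeV, i.e. $E_{b'}/E_a = 0.8$ for a 100 GeV parton $a$.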


2.3.2 Construction of the FSR history 

Let us consider the case that there are two outgoing partons $b$ and $c$ and one parton $r$ which is either
incoming or outgoing. Then let us suppose that the $b$ and $c$ are clustered into a new outgoing parton $a$, and
hence the set of partons $a'$ and $r'$ together with a clustering scale $p_{\perp{\rm clus}}$ is produced. This is illustrated in
Figure 3 (from left to right). The clustering has to proceed as if the final state shower model in PYTHIA8
had evolved the set of the partons $a'$ and $r'$ and then had generated the partons $b$, $c$ and $r$ with the
evolution scale $p_{\perp{\rm evol}} = p_{\perp{\rm clus}}$. The parton $r$ may not be uniquely determined. In our algorithm, the $r$
is randomly chosen. The new outgoing parton $a$ after the clustering is off mass shell. Thus we need to
put the $a$ on mass shell. There are two approaches depending on whether the parton $r$ is outgoing or incoming.


When the $r$ is an outgoing parton, the parton $a$ is put on mass shell by transferring part of the four-momentum
of the parton $a$ to the parton $r$. The four-momentum of the $a + r$ system is kept unchanged, i.e. $p_a + p_r = p_{a'} + p_{r'}$.
The kinematics of all the other partons, indicated by $X$ in Figure 3, including the incoming partons will not
be affected. The four-momenta of the partons $a'$ and $r'$ are derived as follows.

1. Boost the $a$ and $r$ to the $a + r$ rest frame, where their four-momenta are $p_{a,0}$ and $p_{r,0}$.

2. In the $a + r$ rest frame, calculate the energies and the absolute values of the momenta of the $a$ and $r$
which are put on mass shell with the on-shell masses $m_a$ and $m_r$,

$$m_{ar}^2 = (p_{a,0} + p_{r,0})^2, \tag{2.28a}$$

$$E_{a,{\rm new}} = \frac{m_{ar}^2 + m_a^2 - m_r^2}{2 m_{ar}}, \tag{2.28b}$$

$$E_{r,{\rm new}} = \frac{m_{ar}^2 - m_a^2 + m_r^2}{2 m_{ar}}, \tag{2.28c}$$

$$|\vec{p}_{a,{\rm new}}| = |\vec{p}_{r,{\rm new}}| = \sqrt{E_{a,{\rm new}}^2 - m_a^2}. \tag{2.28d}$$

3. Modify the magnitudes of the momenta of the $a$ and $r$ to the $|\vec{p}_{a,{\rm new}}|$ and $|\vec{p}_{r,{\rm new}}|$, respectively, while
the directions of the momenta are kept unchanged,

$$p_{a,{\rm new}}^i = \frac{p_{a,0}^i}{|\vec{p}_{a,0}|}\,|\vec{p}_{a,{\rm new}}|, \qquad p_{r,{\rm new}}^i = \frac{p_{r,0}^i}{|\vec{p}_{r,0}|}\,|\vec{p}_{r,{\rm new}}| \qquad (i = 1, 2, 3). \tag{2.29}$$

4. Boost these back to the original $a + r$ frame.
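The four steps above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names and the example momenta are assumptions, and four-vectors are plain tuples $(E, p_x, p_y, p_z)$.

```python
import math


def mdot(p, q):
    """Minkowski product with metric (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def boost(p, beta):
    """Active Lorentz boost of p by the velocity vector beta = (bx, by, bz)."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return tuple(p)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = sum(b * x for b, x in zip(beta, p[1:]))
    g2 = (gamma - 1.0) / b2
    space = tuple(x + g2 * bp * b + gamma * b * p[0] for x, b in zip(p[1:], beta))
    return (gamma * (p[0] + bp),) + space


def put_on_shell_outgoing(p_a, p_r, m_a, m_r):
    """Return (p_a', p_r') with on-shell masses, keeping p_a + p_r fixed."""
    total = tuple(a + r for a, r in zip(p_a, p_r))
    beta = tuple(x / total[0] for x in total[1:])
    to_rest = tuple(-b for b in beta)
    a0, r0 = boost(p_a, to_rest), boost(p_r, to_rest)              # step 1
    m_ar = math.sqrt(mdot(total, total))                            # eq. (2.28a)
    e_a = (m_ar**2 + m_a**2 - m_r**2) / (2.0 * m_ar)                # eq. (2.28b)
    e_r = (m_ar**2 - m_a**2 + m_r**2) / (2.0 * m_ar)                # eq. (2.28c)
    p_new = math.sqrt(e_a**2 - m_a**2)                              # eq. (2.28d)
    norm_a = math.sqrt(sum(x * x for x in a0[1:]))
    norm_r = math.sqrt(sum(x * x for x in r0[1:]))
    a_new = (e_a,) + tuple(x * p_new / norm_a for x in a0[1:])      # eq. (2.29)
    r_new = (e_r,) + tuple(x * p_new / norm_r for x in r0[1:])
    return boost(a_new, beta), boost(r_new, beta)                   # step 4


# Made-up example: an off-shell a (virtuality 1100 GeV^2) and a massless r.
a_prime, r_prime = put_on_shell_outgoing(
    (60.0, 0.0, 30.0, 40.0), (50.0, 0.0, -30.0, 40.0), 0.0, 0.0
)
print(a_prime, r_prime)
```

In this example both returned momenta are massless and their sum equals the original $p_a + p_r$, as required by $p_a + p_r = p_{a'} + p_{r'}$.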


When the $r$ is an incoming parton, the parton $a$ is put on mass shell by reducing the four-momenta of
the $a$ and $r$, while the four-momentum $p_r - p_a$ is kept unchanged. Thus, the four-momentum $p_a + p_r$ will
not be conserved in this case. The four-momenta of the partons $a'$ and $r'$ are derived from

$$\alpha = \frac{p_r^3}{E_r} = 1 \ {\rm or}\ -1, \tag{2.30a}$$

$$p_{r'} = \left( E_r - \frac{(p_a)^2 - m_a^2}{2(E_a - \alpha p_a^3)},\ 0,\ 0,\ p_r^3 - \alpha\,\frac{(p_a)^2 - m_a^2}{2(E_a - \alpha p_a^3)} \right), \tag{2.30b}$$

$$p_{a'} = \left( E_a + E_{r'} - E_r,\ p_a^1,\ p_a^2,\ p_a^3 + p_{r'}^3 - p_r^3 \right). \tag{2.30c}$$

For either case, the clustering scale $p_{\perp{\rm clus}}$ is derived from

$$p_a = p_b + p_c, \tag{2.31a}$$

$$p_0 = p_{a'} + p_{r'}, \tag{2.31b}$$

$$z = \frac{p_0 \cdot p_b}{p_0 \cdot p_a} \left( 1 - \frac{p_0 \cdot p_c}{p_0 \cdot p_b}\, \frac{m_a^2}{2\, p_0 \cdot p_a + m_r^2 - m_a^2 - p_0^2} \right), \tag{2.31c}$$

$$p_{\perp{\rm clus}}^2 = z(1 - z)\left( (p_a)^2 - m_a^2 \right), \tag{2.31d}$$

where $m_a$ and $m_r$ are the on-shell masses of the partons $a$ and $r$, respectively. Notice that the $p_0$ is
constructed from the four-momenta of the partons $a'$ and $r'$, which are obtained from the above algorithm.
The mass effect is taken into account. This is particularly relevant in the clustering which includes a top
quark. Our algorithm is carefully tested as follows. We apply the algorithm to a process which has been
generated by the PYTHIA8 final state evolution of a hard process, and then we confirm that the algorithm
correctly reproduces the hard process.

2.3.3 Some technical details 

In this section, we list some technical details of our implementation of the shower history construction:

• The clustering $2 \to 1$ must respect the QCD $1 \to 2$ vertices.

• When there is more than one candidate for a clustering pair at a clustering step, the one which has
the lowest clustering scale is always chosen (see footnote 6).

• Sequential clustering scales are required to be ordered, that is, a clustering scale at a clustering step is
required to be higher than the scale at the previous clustering step. However, it is not required that
$t_X > t_1, t_2, \cdots$.

• If the hard process $\{p\}_X$ cannot be obtained at the end of the sequential clusterings, we take the following
approach. Let us assume the case that the $\{p\}_X$ is not obtained from a $\{p\}_{X+n}$ after $n$ sequential
clusterings. The program for the shower history construction is executed again on the $\{p\}_{X+n}$, and
this time a clustering pair whose clustering scale is not the lowest but the second lowest is chosen at
the first clustering step. If the $\{p\}_X$ is still not obtained, the program is executed again on the $\{p\}_{X+n}$
and a clustering pair whose clustering scale is the third lowest is chosen at the first clustering step,
and so on. If this approach still does not help, the shower history construction for the event sample is
abandoned.
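The selection and retry rules listed above can be sketched as follows. This is a schematic illustration only: the function names, the `candidates` callback (which is assumed to return the allowed QCD clusterings of an event as `(scale, clustered_event)` pairs) and the toy event labels are all hypothetical.

```python
def build_history(event, n_steps, candidates, is_hard_process, first_rank=0):
    """Cluster n_steps times, always taking the lowest ordered scale, except that
    the first step takes the candidate with rank first_rank (0 = lowest)."""
    history, current, previous_scale = [], event, 0.0
    for step in range(n_steps):
        # Keep only clusterings that respect the ordering of the scales.
        options = sorted(
            (c for c in candidates(current) if c[0] > previous_scale),
            key=lambda c: c[0],
        )
        rank = first_rank if step == 0 else 0
        if rank >= len(options):
            return None
        scale, clustered = options[rank]
        history.append((scale, clustered))
        current, previous_scale = clustered, scale
    return history if is_hard_process(current) else None


def build_history_with_fallback(event, n_steps, candidates, is_hard_process):
    """Retry with the second lowest, third lowest, ... scale at the first step."""
    for rank in range(len(candidates(event))):
        history = build_history(event, n_steps, candidates, is_hard_process, rank)
        if history is not None:
            return history
    return None  # the history construction is abandoned


# Toy example: the lowest first clustering leads to an unordered second step.
TOY = {
    "X+2": [(5.0, "A"), (8.0, "B")],
    "A": [(3.0, "X")],
    "B": [(12.0, "X")],
    "X": [],
}
history = build_history_with_fallback(
    "X+2", 2, lambda e: TOY[e], lambda e: e == "X"
)
print(history)
```

In the toy example the first attempt fails because the scale 3.0 is lower than the preceding scale 5.0, and the second attempt, which starts from the second lowest scale 8.0, reaches the hard process.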

6 A more sophisticated approach is proposed in ref. m- 
2.4 Weight functions

In eq. (2.10) of Section 2.1 we have introduced a function $f(z, t; \{p\}_{X+n})$ which constrains the DGLAP
shower evolution of the $\{p\}_{X+n}$ at the evolution scale $t$ and the energy fraction $z$. In the improved DGLAP
equation, some part of the constraint is already included in the tree level cross sections. This has been
implied by using $f'(z, t; \{p\}_{X+n})$ instead of $f(z, t; \{p\}_{X+n})$ in eqs. (2.18) or (2.22). To make it clear, let
us write down the second term in the right hand side (RHS) of the DGLAP evolution of the hard process
$\sigma(X)$ in eq. (2.16), which gives the integrated probability of exclusively generating one radiation during the
evolution between $t_X$ and $t_\Lambda$,

$$\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1)\, f(z_1, t_1; \{p\}_X), \tag{2.32}$$

where now the hard process is explicitly written and the Sudakov form factors are omitted. This term is
improved with the help of the tree level cross section

$$\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_1)\, g(x_2, t_1)\, \hat\sigma(gg \to X + g; \hat{s} = x_1 x_2 s)\, f'(z_1, t_1; \{p\}_X), \tag{2.33}$$

where the merging scale cut $\{p\}_{X+g} > Q_{\rm cut}^2$ is implicit. Furthermore, we write this as

$$\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_\Lambda)\, g(x_2, t_\Lambda)\, \hat\sigma(gg \to X + g; \hat{s} = x_1 x_2 s)\, f'(z_1, t_1; \{p\}_X)\, f'(t_\Lambda; \{p\}_{X+g}). \tag{2.34}$$

A fixed scale $t_\Lambda$ is now used in the PDFs and this change is included in the function $f'(t_\Lambda; \{p\}_{X+g})$. We
call $f'(z, t; \{p\}_{X+n})$ a weight function hereafter, since it is used to re-weight an event sample in the merging
algorithms; see the discussion below eq. (2.20) for details. The first task in this section is to derive the
weight function $f'(z_1, t_1; \{p\}_X)$ explicitly.

There are three radiation patterns in the PYTHIA8 parton shower, namely final state radiation with an
outgoing recoiling parton, initial state radiation with an incoming recoiling parton and final state radiation
with an incoming recoiling parton. This is also discussed in Section 2.3. For the first case, i.e. final state
radiation with an outgoing recoiling parton, the kinematics of the incoming partons will not be changed.
Hence eq. (2.32) is evaluated as

$$\begin{aligned}
&\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1) \\
&= \int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X + g; \hat{s} = x_1 x_2 s) \\
&= \int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_\Lambda)\, g(x_2, t_\Lambda)\, \hat\sigma(gg \to X + g; \hat{s} = x_1 x_2 s)\, \frac{g(x_1, t_X)}{g(x_1, t_\Lambda)}\, \frac{g(x_2, t_X)}{g(x_2, t_\Lambda)},
\end{aligned} \tag{2.35}$$

thus we obtain

$$f'(z_1, t_1; \{p\}_X)\, f'(t_\Lambda; \{p\}_{X+g}) = \frac{g(x_1, t_X)}{g(x_1, t_\Lambda)}\, \frac{g(x_2, t_X)}{g(x_2, t_\Lambda)}, \tag{2.36}$$

or equivalently

$$f'(z_1, t_1; \{p\}_X)\, f'(t_\Lambda; \{p\}_{X+g}) = \frac{g(x_1, t_1)}{g(x_1, t_\Lambda)}\, \frac{g(x_1, t_X)}{g(x_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)}\, \frac{g(x_2, t_1)}{g(x_2, t_\Lambda)}. \tag{2.37}$$

For the second case, i.e. initial state radiation with an incoming recoiling parton, the kinematics of a
radiating incoming parton will be changed. Thus the weight function should include the parton distribution
functions (PDFs). The PDF factor is explicitly given in eq. (2.8). When we assume that the incoming parton
which has the energy fraction $x_1$ is the radiating parton, eq. (2.32) is evaluated as

$$\begin{aligned}
&\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1)\, \frac{g(x_1/z_1, t_1)}{g(x_1, t_1)} \\
&= \int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X + g; \hat{s} = (x_1/z_1) x_2 s)\, \frac{1}{z_1}\, \frac{g(x_1/z_1, t_1)}{g(x_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(z_1 w_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X + g; \hat{s} = w_1 x_2 s)\, \frac{g(w_1, t_1)}{g(z_1 w_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(w_1, t_1)\, g(x_2, t_1)\, \hat\sigma(gg \to X + g; \hat{s} = w_1 x_2 s)\, \frac{g(z_1 w_1, t_X)}{g(z_1 w_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(w_1, t_\Lambda)\, g(x_2, t_\Lambda)\, \hat\sigma(gg \to X + g; \hat{s} = w_1 x_2 s)\, \frac{g(w_1, t_1)}{g(w_1, t_\Lambda)}\, \frac{g(z_1 w_1, t_X)}{g(z_1 w_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)}\, \frac{g(x_2, t_1)}{g(x_2, t_\Lambda)},
\end{aligned} \tag{2.38}$$

thus we obtain

$$f'(z_1, t_1; \{p\}_X)\, f'(t_\Lambda; \{p\}_{X+g}) = \frac{g(w_1, t_1)}{g(w_1, t_\Lambda)}\, \frac{g(z_1 w_1, t_X)}{g(z_1 w_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)}\, \frac{g(x_2, t_1)}{g(x_2, t_\Lambda)}. \tag{2.39}$$

Note that the factor $1/z_1$ in the second line of eq. (2.38) comes from $dp(z_1)$, see eq. (2.13). It is cancelled in the
third line by the change of variable $x_1 = z_1 w_1$.


For the third case, i.e. final state radiation with an incoming recoiling parton, the kinematics of the
incoming recoiling parton will be changed. Thus the weight function includes the PDFs [41]. When we
assume that the incoming parton which has the energy fraction $x_1$ is the recoiling parton, and let $w_1$ denote
the energy fraction of the recoiling parton after the radiation, eq. (2.32) is evaluated as

$$\begin{aligned}
&\int_0^1 dx_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1)\, \frac{w_1}{x_1}\, \frac{g(w_1, t_1)}{g(x_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(x_1, t_X)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1)\, \frac{g(w_1, t_1)}{g(x_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(w_1, t_1)\, g(x_2, t_X)\, \hat\sigma(gg \to X; \hat{s} = x_1 x_2 s) \int_{t_\Lambda}^{t_X} \frac{dt_1}{t_1} \int_0^1 dp(z_1)\, \frac{g(x_1, t_X)}{g(x_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(w_1, t_1)\, g(x_2, t_X)\, \hat\sigma(gg \to X + g; \hat{s} = w_1 x_2 s)\, \frac{g(x_1, t_X)}{g(x_1, t_1)} \\
&= \int_0^1 dw_1 \int_0^1 dx_2\, g(w_1, t_\Lambda)\, g(x_2, t_\Lambda)\, \hat\sigma(gg \to X + g; \hat{s} = w_1 x_2 s)\, \frac{g(w_1, t_1)}{g(w_1, t_\Lambda)}\, \frac{g(x_1, t_X)}{g(x_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)}\, \frac{g(x_2, t_1)}{g(x_2, t_\Lambda)},
\end{aligned} \tag{2.40}$$

thus we obtain

$$f'(z_1, t_1; \{p\}_X)\, f'(t_\Lambda; \{p\}_{X+g}) = \frac{g(w_1, t_1)}{g(w_1, t_\Lambda)}\, \frac{g(x_1, t_X)}{g(x_1, t_1)}\, \frac{g(x_2, t_X)}{g(x_2, t_1)}\, \frac{g(x_2, t_1)}{g(x_2, t_\Lambda)}. \tag{2.41}$$


By looking at the weight functions of the three radiation patterns in eqs. (2.37), (2.39) and (2.41), the
general expression can be found as follows. Let us suppose an event sample $\{p\}_{X+n}$, which gives a shower
history $\{p\}_{X+(n-1)}, \{p\}_{X+(n-2)}, \cdots, \{p\}_{X+i}, \cdots, \{p\}_{X+1}, \{p\}_X$ with $t_n < t_{n-1} < \cdots < t_{i+1} < \cdots < t_2 < t_1$.
Let us also define the energy fractions and the parton types of the incoming partons in the $\{p\}_{X+i}$ by $x_1^{(i)}$,
$x_2^{(i)}$ and $f_1^{(i)}$, $f_2^{(i)}$, respectively. The weight function for the $\{p\}_{X+i}$ is given by [20]

$$f'(z_{i+1}, t_{i+1}; \{p\}_{X+i}) = \frac{\alpha_s(t_{i+1})}{\alpha_s(t_\Lambda)}\, \frac{f_1^{(i)}(x_1^{(i)}, t_i)}{f_1^{(i)}(x_1^{(i)}, t_{i+1})}\, \frac{f_2^{(i)}(x_2^{(i)}, t_i)}{f_2^{(i)}(x_2^{(i)}, t_{i+1})}, \tag{2.42}$$

from which the total weight function for the $\{p\}_{X+n}$ is

$$\prod_{i=0}^{n} f'(z_{i+1}, t_{i+1}; \{p\}_{X+i}), \tag{2.43}$$

where $t_0 = t_X$ and $t_{n+1} = t_\Lambda$. It is assumed that the scale $t_\Lambda$ is used as the scales of the PDFs and those of
the strong couplings for generating the $\{p\}_{X+n}$. A factor consisting of the strong couplings is present in the
function, because the evolution scale $t$ is used as the scale in the strong coupling in the PYTHIA8 parton
shower evolution and thus this strategy should also be used in the improved evolution equations [14]. We
take into account the fact that the PYTHIA8 shower model uses different strong couplings for initial
state radiation and final state radiation. When a radiation is classified as the initial state radiation (the final
state radiation) with a scale $t_{i+1}$ by the shower history construction, $\alpha_s(m_Z) = 0.137$ ($0.1383$) is used for the
factor in eq. (2.42). It can be easily confirmed that the results in eqs. (2.37), (2.39) and (2.41) are derived,
apart from the strong coupling factor, from eq. (2.43) by setting $n = 1$.

Figure 4: Schematic pictures showing that two different shower histories are constructed for an event $gg \to t\bar{t}g$. In
the left picture, the clustering $t + g \to t$, which gives the lowest clustering scale $t_1$ of all the candidates for a clustering
pair in the event, is performed. The right picture shows the case that the clustering $t + g \to t$ is not allowed in the
program. A different clustering which gives a higher scale $t_1'$ than the $t_1$ will be performed instead for the same event.
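The product in eqs. (2.42) and (2.43) is straightforward to evaluate once a shower history is available. The following is a minimal sketch: `toy_gluon` and `toy_alpha_s` are invented stand-ins for a real PDF set and running coupling, the same PDF is used for both beams and all steps, and a single coupling is used for ISR and FSR, all of which are simplifying assumptions for illustration.

```python
import math


def toy_gluon(x, t):
    """Hypothetical gluon-like PDF with a mild scale dependence (illustration only)."""
    return (1.0 - x) ** (5.0 + 0.2 * math.log(t)) / x


def toy_alpha_s(t):
    """Hypothetical one-loop-like running coupling; t in GeV^2 (illustration only)."""
    return 1.0 / (0.61 * math.log(t / 0.04))


def total_weight(t_X, t_L, scales, x1, x2, pdf, alpha_s):
    """Total weight of eq. (2.43) for a history with scales [t_1, ..., t_n].

    x1[i] and x2[i] are the incoming energy fractions in {p}_{X+i}, i = 0..n.
    """
    n = len(scales)
    if not (len(x1) == len(x2) == n + 1):
        raise ValueError("need n + 1 energy fractions per beam")
    t = [t_X] + list(scales) + [t_L]            # t_0 = t_X, t_{n+1} = t_Lambda
    weight = 1.0
    for i in range(n + 1):                      # one factor of eq. (2.42) per step
        weight *= alpha_s(t[i + 1]) / alpha_s(t_L)
        weight *= pdf(x1[i], t[i]) / pdf(x1[i], t[i + 1])
        weight *= pdf(x2[i], t[i]) / pdf(x2[i], t[i + 1])
    return weight


# n = 1 example in the ISR configuration of eq. (2.39): x1 = z1*w1 -> w1.
w = total_weight(4.0e4, 1.0e4, [900.0], [0.02, 0.05], [0.03, 0.03],
                 toy_gluon, toy_alpha_s)
print(w)
```

For $n = 1$ the PDF part of this product reduces to the four ratios of eq. (2.39), multiplied by $\alpha_s(t_1)/\alpha_s(t_\Lambda)$.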


2.5 The merging algorithm with top decays 

The decay of the top quark is characterized by its decay width $\Gamma(t \to bW^+) \sim 1.5$ GeV. The large scale
discrepancy between the decay width and the production scale of a top quark ($\sim m_t$) indicates that QCD
radiation off a top quark takes place faster than the decay of the top quark. Although gluon radiation off a
top quark is suppressed by its large mass, it can be important for more accurate predictions. The parton
shower generator in PYTHIA8 models gluon radiation off heavy particles including the top quark [42].
Therefore, the construction of the PYTHIA8 parton shower history also has to take into account the
clustering which includes a top quark, i.e. $t + g \to t$.


Let us first examine the case that the clustering $t + g \to t$ is not implemented in the program for
the shower history construction. Figure 4 presents schematic pictures showing that two different shower
histories are constructed for an event $gg \to t\bar{t}g$. In the left picture, the clustering $t + g \to t$, which gives
the lowest clustering scale $t_1$ of all the candidates for a clustering pair in the event, is performed. If the
clustering $t + g \to t$ is not implemented in the program, a different clustering which gives a higher scale
$t_1'$ than the $t_1$ will be performed for the same event, as illustrated in the right picture. This difference in
the parton shower history affects the calculation of the Sudakov form factors in the evolution equation in
eq. (2.22) and thus can induce problems such as an unnecessary dependence on the merging scale.


The clustering $t + g \to t$ in the construction of the PYTHIA8 parton shower history is necessary, no
matter whether top quarks are decayed in the simulation, since the decays occur after gluon radiation off the
top quarks as discussed above. The simplest approach to simulate the $t\bar{t}$ production including a $t\bar{t}$ decay
may be as follows. At first, the event samples of the $t\bar{t}$ production process are generated with the merging
algorithms, i.e. $X = t\bar{t}$ in eq. (2.22), by assuming that the $t\bar{t}$ is stable. Then, the top and antitop quarks are
decayed independently following the differential partial decay width, $d\Gamma(t \to l^+ \nu b)$ or $d\Gamma(t \to q\bar{q}'b)$. QCD
radiation off the decay products will be generated at the end. In this simple approach, correlations between
the decay products of the top quark and those of the antitop quark are not produced correctly, and the
off shell effects of the top and antitop quarks are also absent. One solution to these issues is to generate
the event samples which include a $t\bar{t}$ decay as a part of the hard process, according to the exact tree level
matrix elements.

Figure 5: Left: An event sample $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$ reconstructs a $\{p\}_{t\bar{t}+g}$ as an intermediate process, at the first arrow.
A shower history of the $\{p\}_{t\bar{t}+g}$ is constructed, at the second arrow. Right: A part of the shower evolution of the
$\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$. The black lines represent the particles generated according to the tree level cross section. The red
lines represent particles generated with the shower evolution.


In our study, the event generation of the $t\bar{t}$ production including a $t\bar{t}$ decay is performed as follows. Let us
consider an event sample $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$, which originates from a $\{p\}_{t\bar{t}+g}$ and is generated with the tree level
matrix elements. At first, the $\{p\}_{t\bar{t}+g}$ is reconstructed as an intermediate process from the $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$. The
top and antitop quarks are not necessarily on-shell. Next, the program for the shower history construction
is executed on the $\{p\}_{t\bar{t}+g}$, and hence a $\{p\}_{t\bar{t}}$ and a clustering scale $t_1$ are obtained as a shower history. The
sequence of the procedures is illustrated in the left picture of Figure 5. The first Sudakov form factor in the
evolution equation in eq. (2.22) will be calculated based on the $\{p\}_{t\bar{t}}$ and the $t_1$, i.e. $\Pi(t_X, t_1; \{p\}_{t\bar{t}})$. The
calculation of the second Sudakov form factor $\Pi(t_1, \{p\}_{t\bar{t}+2} < Q_{\rm cut}^2; \cdots; \{p\}_{t\bar{t}+g})$ looks a little tricky, because
the shower generator must be executed on the $\{p\}_{t\bar{t}+g}$ for the calculation of the second Sudakov form factor,
while it must be executed on the $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$ for the event generation. In fact, when the top and antitop
quarks are present as intermediate particles in a Les Houches event file of the $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$, the PYTHIA8
parton shower program starts the shower evolution of the $\{p\}_{t\bar{t}+g}$ at first. This implementation makes
the calculation of $\Pi(t_1, \{p\}_{t\bar{t}+2} < Q_{\rm cut}^2; \cdots; \{p\}_{t\bar{t}+g})$ possible. Once the shower evolution of the $\{p\}_{t\bar{t}+g}$ is
completed, the kinematic change of the $t\bar{t}$ due to the evolution is reflected into the kinematics of its decay
products, i.e. $bl^+\nu\bar{b}l^-\bar\nu$. Finally, the shower evolution of the decay products is performed in such a way that
it does not change the invariant mass of the top quark and that of the antitop quark [40]. The right picture
of Figure 5 schematically shows a part of the shower evolution of the $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+g}$. The black lines represent
the particles generated according to the tree level cross section. The red lines represent particles generated
with the shower evolution. Note that the first radiation off the bottom quark is also corrected internally by
using the differential decay width $d\Gamma(t \to bW^+g)$ [42].


2.6 Event generation 

In this section, we describe the procedure for the event generation of the top quark pair production in detail,
by combining the knowledge given in the above sections.


1. Generate the event samples for the $t\bar{t} + 0, 1, \ldots, N$ partons production processes in proton proton ($pp$)
collisions according to the tree level cross sections, i.e. $\{p\}_{t\bar{t}}, \{p\}_{t\bar{t}+1}, \cdots, \{p\}_{t\bar{t}+N}$. When a decay of
the $t\bar{t}$ is to be simulated, the event samples are generated according to the tree level cross sections
including the decay of the $t\bar{t}$, i.e. $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu}, \{p\}_{bl^+\nu\bar{b}l^-\bar\nu+1}, \cdots, \{p\}_{bl^+\nu\bar{b}l^-\bar\nu+N}$. In this paper, only the
dilepton decay is studied. Here $N$ denotes the maximal number of partons provided by the tree level
cross section. We use MadGraph5_aMC@NLO[4j. version 5.2.2.1 for this purpose. The merging scale 


$Q_{\rm cut}$ is defined by the longitudinal-boost invariant $k_\perp$ variable [39]

$$k_{\perp iB} = p_{Ti}, \tag{2.44a}$$

$$k_{\perp ij} = \min(p_{Ti}, p_{Tj})\, \sqrt{(y_i - y_j)^2 + (\phi_i - \phi_j)^2}\, / R, \tag{2.44b}$$

where $p_{Ti}$, $y_i$ and $\phi_i$ are the transverse momentum with respect to the beam, the rapidity and the azimuthal
angle of outgoing particle $i$. $R$ is the radius parameter and $R = 1$ is used if not otherwise specified. The
merging scale cut is imposed only on the light partons, and no cut is imposed on the $t\bar{t}$ and its decay
products. A fixed value $t_\Lambda$ is used for the scales in the strong couplings and in the parton distribution
functions (PDFs). The PDF set CTEQ6L1 [33] is used. The center of mass energy for the $pp$ collisions
is 14 TeV, except in the case that the simulation is compared with the data at 7 TeV.

2. Select an event sample for the $t\bar{t} + n$ partons process, i.e. $\{p\}_{t\bar{t}+n}$, or $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+n}$ when the $t\bar{t}$ is
decayed, with the probability proportional to its integrated tree level cross section obtained in step 1,

$$\frac{\sigma(pp \to t\bar{t} + n)}{\sum_{i=0}^{N} \sigma(pp \to t\bar{t} + i)}. \tag{2.45}$$


3. Construct a parton shower history of the $\{p\}_{t\bar{t}+n}$ by following the procedure described in Section 2.3.
The history consists of the intermediate events $\{p\}_{t\bar{t}+(n-1)}, \{p\}_{t\bar{t}+(n-2)}, \cdots, \{p\}_{t\bar{t}+i}, \cdots, \{p\}_{t\bar{t}+1}, \{p\}_{t\bar{t}}$ and
the scales $t_n < t_{n-1} < \cdots < t_{i+1} < \cdots < t_2 < t_1$. When the $t\bar{t}$ is decayed, a $\{p\}_{t\bar{t}+n}$ is reconstructed
at first from the $\{p\}_{bl^+\nu\bar{b}l^-\bar\nu+n}$ as described in Section 2.5.


4. When the CKKW-L algorithm is used, the merging scale cut is imposed on the intermediate events,
too, as indicated in eq. (2.21). Thus, the event sample is vetoed as a whole, unless $\{p\}_{t\bar{t}+i} > Q_{\rm cut}^2$ for
$i = 1, 2, \cdots, n-2, n-1$. This is not the case in the CKKW-L+ algorithm.


5. Calculate the weight function for the $\{p\}_{t\bar{t}+n}$ by using eqs. (2.42) and (2.43). The scale $t_X$ must be
determined from the intermediate event $\{p\}_{t\bar{t}}$ and we use

$$t_X = E_T(t) \times E_T(\bar{t}), \tag{2.46}$$

where $E_T^2 = m^2 + p_T^2$. Note that when the $t\bar{t}$ is on-shell, $E_T(t) = E_T(\bar{t})$ and thus $t_X = E_T^2(t)$. This is not
the case when the decay of the $t\bar{t}$ is also generated in step 1, since the reconstructed $t\bar{t}$ is not necessarily
on-shell. This scale $t_X$ is also used as the renormalization scale of the strong couplings $\alpha_s^2$ for the
hard process, that is,

$$\frac{\alpha_s^2(t_X)}{\alpha_s^2(t_\Lambda)} \tag{2.47}$$

should be added in the weight function. We use $\alpha_s(m_Z) = 0.13$ in the above factor. When the
construction of the shower history was abandoned in step 3, it is not possible to calculate the weight
function in the given way. In such a case, the weight function is given by, instead of eqs. (2.42) and
(2.43),

$$\frac{\alpha_s^2(t_X)}{\alpha_s^2(t_\Lambda)}\, \frac{\alpha_s(p_T^2(1))\, \alpha_s(p_T^2(2)) \cdots \alpha_s(p_T^2(n))}{\alpha_s^n(t_\Lambda)}\, \frac{f_1^{(n)}(x_1^{(n)}, t_X)\, f_2^{(n)}(x_2^{(n)}, t_X)}{f_1^{(n)}(x_1^{(n)}, t_\Lambda)\, f_2^{(n)}(x_2^{(n)}, t_\Lambda)}, \tag{2.48}$$

where $p_T(i)$ is the transverse momentum of a parton $i$ in the $\{p\}_{t\bar{t}+n}$. The scale $t_X$ is defined by
eq. (2.46) and now is determined from the $\{p\}_{t\bar{t}+n}$. We use $\alpha_s(m_Z) = 0.13$ for all the strong couplings
in eq. (2.48). Once the weight function is obtained, the $\{p\}_{t\bar{t}+n}$ is re-weighted with the function.
However, the weight function is not bounded above by unity. Therefore, the upper bound of the weight
function must be found at first by calculating the weight function for a large number of $\{p\}_{t\bar{t}+n}$. The
integrated cross section obtained in step 1 then has to be multiplied by the obtained upper bound of the
weight function.
Figure 6: The $Q_{\rm cut}$ dependence of the differential jet rates for $1 \to 0$ (left) and $2 \to 1$ (right) jets. The merging scale
is set to $Q_{\rm cut} = 20$ GeV for the blue solid curve and to $Q_{\rm cut} = 60$ GeV for the red dashed curve. The black broken
curve represents the result without merging algorithms.


6. Calculate the Sudakov form factor(s) by following the procedure described in Section 2.2. When the
construction of the shower history was abandoned in step 3, all the Sudakov form factors are set to
unity. We use the parton shower model lT)j fU 42j in PYTHIA8 0305] version 8186. The default tune 
of the version 8186, tune 4C m, is basically used, while some functions are turned off. To simplify 
the analysis, the hadronization after the shower evolution and the multiple interaction are turned off. 
The rapidity ordering in the initial state radiation is turned off as suggested in ref. EDI- All functions 
inducing azimuthal asymmetry are turned off, since azimuthal angle information of hard partons is 
provided by exact tree level matrix elements in our simulation. 


7. Once the event sample is accepted and thus the shower evolution is performed until the shower cutoff 
scale in step 6, all visible particles in the final state within a rapidity range $|y| < 5.0$ including charged
leptons are clustered to construct inclusive jets according to the anti-fc T algorithm m ■ The tt will 
not be included, if it is not decayed. The radius parameter is R = 0.4 if not otherwise specified. We 
use Fastjet @S] version 3.1.0 for this purpose. The rapidity and p T cuts on jets will be specified in the 
studies of Section 3.


8. Repeat the above procedures from step 2 to step 7 until a large number of the accepted event samples
are accumulated.
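The selection in step 2 and the re-weighting in step 5 can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the cross sections are made-up numbers, and the hit-or-miss acceptance with probability weight divided by the upper bound is an assumption about how the re-weighting is realized, made only for the purpose of the example.

```python
import random


def selection_probabilities(sigmas):
    """Probabilities of eq. (2.45) from the integrated cross sections sigma_0..sigma_N."""
    total = sum(sigmas)
    return [s / total for s in sigmas]


def pick_multiplicity(sigmas, rng):
    """Step 2: choose the number of extra partons n with probability of eq. (2.45)."""
    return rng.choices(range(len(sigmas)), weights=sigmas, k=1)[0]


def accept_by_weight(weight, weight_max, rng):
    """Step 5 (assumed hit-or-miss): keep the event with probability weight / weight_max."""
    if weight > weight_max:
        raise ValueError("weight exceeds the assumed upper bound")
    return rng.random() < weight / weight_max


# Made-up cross sections in pb for tt+0, tt+1, tt+2, tt+3 partons.
sigmas = [300.0, 200.0, 75.0, 25.0]
rng = random.Random(12345)
probs = selection_probabilities(sigmas)
n = pick_multiplicity(sigmas, rng)
kept = accept_by_weight(0.8, 1.6, rng)
print(probs, n, kept)
```

Because accepted events are thinned by the factor weight over upper bound, the cross sections from step 1 are multiplied by the upper bound, as stated in step 5.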


2.7 Tests of our implementation 

In this section, our implementation of the CKKW-L+ merging algorithm is tested. At first, the dependence 
of differential jet rates on the merging scale cutoff Q cut and that on the parton shower starting scale are 
studied. A comparison with experimental data is also presented. 


Differential jet rates are calculated by using the longitudinal-boost invariant $k_\perp$ definition in eq. (2.44)
with the radius parameter $R = 1$. In Figure 6 the differential jet rates for $1 \to 0$ (left) and $2 \to 1$ (right) jets
are plotted. The maximal number of partons $N$ predicted by the tree level cross section is $N = 3$, i.e. the
event samples are generated exactly according to eq. (2.24). A vertical dashed line indicates the merging
scale $Q_{\rm cut}$. The merging scale is set to $Q_{\rm cut} = 20$ GeV for the blue solid curve and to $Q_{\rm cut} = 60$ GeV for the
red dashed curve. The black broken curve represents the result without merging algorithms, i.e. purely the
shower prediction. The three curves are set equal at the bin between 10 and 12 GeV for comparison.

Figure 7: The differential jet rates for $1 \to 0$ (left) and $2 \to 1$ (right) jets, as functions of $k_\perp(1 \to 0)$ [GeV] and
$k_\perp(2 \to 1)$ [GeV]. The dependence on the parton shower starting scale is studied. The solid lines represent the results
obtained with the merging algorithm and the dashed lines represent the results obtained without merging algorithms.
Three different scale choices are considered for the shower starting scale, namely $s = (14)^2$ TeV$^2$, $t_X$ and $t_X/4$.
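For reference, the $k_\perp$ measure of eq. (2.44) used for the merging scale and for the differential jet rates can be written as a short function. This is a minimal sketch with hypothetical function names; wrapping the azimuthal difference into $[0, \pi]$ is an assumption added here, since eq. (2.44b) only writes $(\phi_i - \phi_j)^2$.

```python
import math


def kt_beam(pt_i):
    """Eq. (2.44a): distance of particle i to the beam."""
    return pt_i


def kt_pair(pt_i, y_i, phi_i, pt_j, y_j, phi_j, R=1.0):
    """Eq. (2.44b): distance between particles i and j with radius parameter R."""
    dphi = abs(phi_i - phi_j) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi       # assumed wrap of the azimuthal difference
    return min(pt_i, pt_j) * math.hypot(y_i - y_j, dphi) / R


# Illustrative partons: pT in GeV, rapidity, azimuthal angle in radians.
d_ij = kt_pair(50.0, 0.0, 0.0, 30.0, 0.3, 0.4)
d_iB = kt_beam(30.0)
print(d_ij, d_iB)
```

In this example the pair distance is $30 \times 0.5 = 15$ GeV, which is smaller than the beam distance of 30 GeV, so with $Q_{\rm cut} = 20$ GeV this pair would fall below the merging scale.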

The obtained results are distributed smoothly around the merging scales. A clear difference is, however,
observed by varying the merging scale from 20 GeV to 60 GeV. The reason for this is that the results
obtained with the merging algorithm already deviate from the parton shower prediction in the soft and/or
collinear region. By comparing the blue solid curve and the black broken curve, it is clear that the result
obtained with the merging algorithm starts to deviate from the shower prediction at around the merging
scale (20 GeV). The same can be observed for the red dashed curve. This behavior is quite natural, considering
the idea of merging algorithms, namely that the region below the merging scale is populated by parton showers
and the region above the merging scale by tree level matrix elements. The observation of the clear difference does
not imply a fault in our implementation of the CKKW-L+ merging algorithm, but suggests that we should
choose a smaller value for the merging scale for the $t\bar{t}$ pair production when the current shower model is used.

As the second test, the dependence on the parton shower starting scale is studied. The parton shower starting scale has been explained at the end of Section 2.2. Although the most natural choice for the parton shower starting scale is the scale $t_X$, we have argued that predictions such as a jet $p_T$ distribution should be insensitive to the parton shower starting scale. We have also mentioned that the inclusive cross section can be sensitive to it. We confirm these statements in the following. We consider three different scale choices for the parton shower starting scale, namely $s = (14)^2\ {\rm TeV}^2$, $t_X$ and $t_X/4$. We use $N = 3$ and $Q_{\rm cut} = 20$ GeV. The results are shown as the differential jet rates for $1 \to 0$ (left) and $2 \to 1$ (right) jets in Figure 7. The solid lines represent the results obtained with the merging algorithm and the dashed lines represent the results obtained without merging algorithms. It is clearly shown that the dependence on the parton shower starting scale is reduced significantly by the merging algorithm.


The inclusive cross section and the ratio of the exclusive cross section to the inclusive cross section are introduced in eqs. (2.25) and (2.26). They are shown in Table 1 for the different choices of the parton shower starting scale. Note that the inclusive cross section without merging algorithms is 562 pb. While the inclusive cross section is sensitive to the shower starting scale, the ratios are little affected. This is exactly what we have argued at the end of Section 2.2. The small increase of $\sigma_{\rm exc}(t\bar{t}+3)/\sigma_{\rm inc}$ for $t_X/4$ may explain the small enhancement in the high $k_\perp$ region in Figure 7.
| Shower starting scale | $\sigma_{\rm inc}(t\bar{t})$ (pb) | $\sigma_{\rm exc}(t\bar{t}+0)/\sigma_{\rm inc}(t\bar{t})$ | $\sigma_{\rm exc}(t\bar{t}+1)/\sigma_{\rm inc}(t\bar{t})$ | $\sigma_{\rm exc}(t\bar{t}+2)/\sigma_{\rm inc}(t\bar{t})$ | $\sigma_{\rm exc}(t\bar{t}+3)/\sigma_{\rm inc}(t\bar{t})$ |
|---|---|---|---|---|---|
| $s$ | 346 | 0.28 | 0.33 | 0.22 | 0.17 |
| $t_X$ | 422 | 0.28 | 0.33 | 0.22 | 0.17 |
| $t_X/4$ | 556 | 0.27 | 0.33 | 0.22 | 0.18 |

Table 1: The inclusive cross section and the ratio of the exclusive cross section to the inclusive cross section, with the different choices for the shower starting scale. The inclusive cross section without merging algorithms is 562 pb.
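As a quick consistency check on Table 1, the following sketch recomputes the exclusive cross sections from the tabulated ratios, verifies that the ratios sum to one within rounding, and compares each inclusive cross section with the unmerged value of 562 pb. The dictionary layout and the function names are illustrative assumptions; only the numbers are taken from the table and its caption.

```python
# Numbers copied from Table 1; the data layout and all names are illustrative.
TABLE1 = {
    "s":     {"sigma_inc_pb": 346.0, "ratios": [0.28, 0.33, 0.22, 0.17]},
    "t_X":   {"sigma_inc_pb": 422.0, "ratios": [0.28, 0.33, 0.22, 0.17]},
    "t_X/4": {"sigma_inc_pb": 556.0, "ratios": [0.27, 0.33, 0.22, 0.18]},
}
SIGMA_UNMERGED_PB = 562.0  # inclusive cross section without merging algorithms


def exclusive_cross_sections(row):
    """Return sigma_exc(tt+n) in pb for n = 0..3 as ratio * sigma_inc."""
    return [ratio * row["sigma_inc_pb"] for ratio in row["ratios"]]


def fraction_of_unmerged(row):
    """Ratio of the merged inclusive cross section to the unmerged one."""
    return row["sigma_inc_pb"] / SIGMA_UNMERGED_PB


if __name__ == "__main__":
    for scale, row in TABLE1.items():
        exc = ", ".join(f"{x:.1f}" for x in exclusive_cross_sections(row))
        print(f"{scale:6s} sum of ratios = {sum(row['ratios']):.2f}, "
              f"sigma_exc (pb) = [{exc}], "
              f"sigma_inc / 562 pb = {fraction_of_unmerged(row):.3f}")
```

The last column printed makes the efficiency argument explicit: the choice $t_X/4$ retains almost the full unmerged cross section, while the choice $s$ retains roughly 60% of it.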


The discussion and the results up to now have assumed that the top and antitop quarks are stable. Three jet observables are produced from the event samples including the leptonic decays of the top and antitop quarks and plotted in Figure 8. We use $N = 2$ and $Q_{\rm cut} = 20$ GeV. The center of mass energy for the pp collisions is 7 TeV. The differential cross section as a function of jet multiplicity is shown in the left panel. The last bin includes the contribution of $N_{\rm jets} > 6$. The gap fraction is defined as

$$ f(x) = \frac{N(x)}{N_{\rm total}}, \tag{2.49} $$

where $N(x)$ is the number of events in which the observable takes a value less than the given threshold $x$, and $N_{\rm total}$ is the total number of events. The gap fraction as a function of the $p_T$ of the highest $p_T$ additional jet is shown in the middle panel and that as a function of the scalar sum of the $p_T$ of the additional jets is shown in the right panel. The additional jets are defined as all jets not including the two highest $p_T$ b jets. The green dashed curve shows the result without merging algorithms. The red solid and the blue dotted curves show the results obtained with the merging algorithm. The parton shower starting scale is set to $t_X$ for the red solid curve and to $t_X/4$ for the blue dotted curve. Our predictions are compared to the data from the CMS experiment [35]. The merging algorithm gives a better description of the data in all three jet observables. It can also be confirmed that the predictions are stable under the variation of the shower starting scale between $t_X$ and $t_X/4$.
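The gap fraction of eq. (2.49) is a simple counting ratio, and a minimal sketch may make the definition concrete. The function name, the input list and the convention that an event without additional jets contributes the value 0 are assumptions made here for illustration; they are not taken from the analysis code.

```python
def gap_fraction(values, threshold):
    """Return f(x) = N(x) / N_total for one threshold x.

    `values` holds one number per event, e.g. the pT of the highest-pT
    additional jet (assumed here to be 0.0 if the event has no additional jet).
    """
    if not values:
        raise ValueError("at least one event is required")
    n_below = sum(1 for v in values if v < threshold)
    return n_below / len(values)


if __name__ == "__main__":
    # Hypothetical per-event values in GeV, for illustration only.
    leading_additional_pt = [0.0, 12.0, 35.0, 48.0, 110.0]
    for x in (30.0, 40.0, 120.0):
        print(x, gap_fraction(leading_additional_pt, x))
```

By construction the gap fraction is a non-decreasing function of the threshold and approaches one when the threshold exceeds all values in the sample.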


Smaller inclusive cross sections obtained with merging algorithms imply less efficient event generation. We therefore choose the scale $t_X/4$ for the parton shower starting scale for generating the event samples to be analyzed in the following sections.


3 Azimuthal angle correlation 


In this section, the azimuthal angle difference between the two hardest jets in the top quark pair production process is studied, by using the event samples generated with the merging algorithm described in Section 2. In Section 3.1, the detailed definition of the azimuthal angle difference is specified. In Section 3.2, we discuss the requirements on the merging parameters in order to obtain an accurate prediction of the azimuthal angle difference. A numerical comparison of our merging algorithm with the CKKW-L algorithm is also presented. In Section 3.3, the distribution of the azimuthal angle difference is shown. The result obtained by merging the matrix elements for the $t\bar{t}$ plus up to 2 partons is compared to the one obtained by merging the matrix elements for the $t\bar{t}$ plus up to 3 partons. The roles of the $t\bar{t}+3$ partons matrix elements are studied in detail. It is assumed that the top and antitop quarks are stable in Section 3.3. In Section 3.4, the distribution of the azimuthal angle difference is produced from the event samples including the dilepton decay of the $t\bar{t}$ and the effect of the decay is studied.
Figure 8: The differential cross section as a function of jet multiplicity (left panel). The gap fraction as a function of the $p_T$ of the highest $p_T$ additional jet (middle panel) and that as a function of the scalar sum of the $p_T$ of the additional jets (right panel). The green dashed curve shows the result without merging algorithms. The red solid and the blue dotted curves show the results obtained with the merging algorithm. The shower starting scale is set to $t_X$ for the red solid curve and to $t_X/4$ for the blue dotted curve. The merging parameters are set to $N = 2$ and $Q_{\rm cut} = 20$ GeV. The predictions are compared to the data from the CMS experiment [49].


3.1 Definition 

An event sample with two or more jets is selected and the following requirements, which are often called vector boson fusion (VBF) cuts, are applied to the two hardest jets,

$$ y_1 \times y_2 < 0, \qquad |y_1 - y_2| > 4. \tag{3.1} $$

The transverse momentum $p_T$ of an object with respect to the beam describes the hardness of the object. Therefore the jet with the highest $p_T$ is called the hardest jet and the jet with the second highest $p_T$ is called the second hardest jet; together they are referred to as the two hardest jets. Of these two jets, the one with positive rapidity defines the azimuthal angle $\phi_1$ and the one with negative rapidity defines the azimuthal angle $\phi_2$. The azimuthal angle difference between the two hardest jets is defined by

$$ \Delta\phi = \phi_1 - \phi_2. \tag{3.2} $$

Therefore the jet for $\phi_1$ is not necessarily the hardest jet in our definition. To enhance the correlation in $\Delta\phi$, a cut is applied on the invariant mass of the top quark pair $m_{t\bar{t}}$,

$$ m_{t\bar{t}} < 500\ {\rm GeV}. \tag{3.3} $$

No other cuts are applied to the top and antitop quarks.

All the event samples analyzed in the following sections satisfy the cuts in eqs. (3.1) and (3.3), even when this is not stated explicitly. This is also the case when we analyze exclusive parton level event samples in Section 3.3. There, the cuts in eq. (3.1) are applied on partons, not jets.
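The selection of eqs. (3.1) to (3.3) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code: the `Jet` container and the function name are hypothetical, and mapping $\Delta\phi$ into the interval $[-\pi, \pi]$ is a convention assumed here, since the text only gives $\Delta\phi = \phi_1 - \phi_2$.

```python
import math
from dataclasses import dataclass


@dataclass
class Jet:
    pt: float   # transverse momentum in GeV
    y: float    # rapidity
    phi: float  # azimuthal angle in radians


def delta_phi_after_cuts(jets, m_ttbar, m_ttbar_max=500.0, dy_min=4.0):
    """Return delta phi of the two hardest jets, or None if a cut fails."""
    # Eq. (3.3): upper cut on the invariant mass of the top quark pair.
    if len(jets) < 2 or m_ttbar >= m_ttbar_max:
        return None
    jet_a, jet_b = sorted(jets, key=lambda j: j.pt, reverse=True)[:2]
    # Eq. (3.1): opposite hemispheres and a large rapidity separation.
    if jet_a.y * jet_b.y >= 0.0 or abs(jet_a.y - jet_b.y) <= dy_min:
        return None
    # phi_1 belongs to the jet with positive rapidity, phi_2 to the other one,
    # so the jet defining phi_1 is not necessarily the hardest jet.
    forward, backward = (jet_a, jet_b) if jet_a.y > 0.0 else (jet_b, jet_a)
    dphi = forward.phi - backward.phi  # eq. (3.2)
    # Assumed convention: wrap the difference into [-pi, pi].
    return math.atan2(math.sin(dphi), math.cos(dphi))


if __name__ == "__main__":
    # Hypothetical event, for illustration only.
    event = [Jet(50.0, -2.5, 0.5), Jet(80.0, 2.1, 2.0), Jet(30.0, 0.1, 1.0)]
    print(delta_phi_after_cuts(event, m_ttbar=450.0))
```

Keeping the sign of $\Delta\phi$, rather than taking its absolute value, follows from the rapidity-ordered definition of $\phi_1$ and $\phi_2$ given above.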


3.2 Contamination 

In this section, we discuss the requirements on the merging parameters in order to obtain an accurate prediction of the azimuthal angle difference. We explore a relation between the merging scale $Q_{\rm cut}$ and a lower cut on the $p_T$ of jets for which the contamination is negligible. A numerical comparison of our merging algorithm with the CKKW-L algorithm is also presented.

| $p_{T{\rm cut}}$ (GeV) | 20 | 25 | 30 |
|---|---|---|---|
| CKKW-L | 12.13 | 7.00 | 4.50 |
| CKKW-L+ | 11.65 | 6.70 | 4.29 |

Table 2: The contamination (%), defined in eq. (3.6), for the two different merging algorithms and for the three different $p_{T{\rm cut}}$ values. The rapidity cut on jets is $|y| < 4.5$. The merging parameters are set to $N = 3$ and $Q_{\rm cut} = 20$ GeV.

In order for each of the two hardest jets to have the correct azimuthal angle information, each of them must have its origin in a parton generated with matrix elements. If one or both of them originate from partons generated with parton showers, the angular correlations between them are not correct. One of the requirements on the merging parameters is therefore

$$ N \ge 2. \tag{3.4} $$


Another requirement is that the merging scale $Q_{\rm cut}$ is chosen not larger than the scale of the jet definition. The anti-$k_T$ algorithm [?] includes two parameters, namely the radius parameter $R$ and a lower cutoff on the transverse momentum of jets $p_{T{\rm cut}}$. The radius parameter is not strongly restricted, since we are interested in the two hardest jets, which are well separated from each other, see eq. (3.1). We choose $R = 1$ in the merging scale definition in eq. (2.44), while $R = 0.4$ is used in the jet definition. The $p_{T{\rm cut}}$ has to satisfy [28]

$$ Q_{\rm cut} \le p_{T{\rm cut}}. \tag{3.5} $$


As we have already mentioned several times, the contamination is the contribution from the $t\bar{t}+0,1$ parton matrix elements to the event samples with two or more jets. The contamination can be written by using the inclusive cross section and the exclusive cross sections as

$$ \frac{\sigma_{\rm exc}(t\bar{t}+0) + \sigma_{\rm exc}(t\bar{t}+1)}{\sigma_{\rm inc}(t\bar{t})}. \tag{3.6} $$

This notation is introduced in eq. (2.25). The $\Delta\phi$ prediction will not be reliable unless the contamination is small. The event samples are generated with the two different merging algorithms, namely the CKKW-L and the CKKW-L+. Then, the contamination is calculated for the three different $p_{T{\rm cut}}$ values, namely 20, 25 and 30 GeV. The rapidity cut on jets is set to $|y| < 4.5$. We use $N = 3$ and $Q_{\rm cut} = 20$ GeV, thus the above two requirements are satisfied. The result is shown in units of percentage in Table 2. The contamination decreases as the $p_{T{\rm cut}}$ is raised, as expected. However, the contamination is only slightly more suppressed in the CKKW-L+ algorithm than in the CKKW-L algorithm, e.g. 4.29% compared to 4.50% at $p_{T{\rm cut}} = 30$ GeV.
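To make eq. (3.6) and the comparison in Table 2 concrete, the following sketch evaluates the contamination from three cross sections and computes the relative reduction achieved by the CKKW-L+ algorithm for each $p_{T{\rm cut}}$. The function names and the data layout are illustrative assumptions; the percentages are copied from Table 2, while the cross sections passed to `contamination` in the usage line are hypothetical.

```python
def contamination(sigma_exc_0, sigma_exc_1, sigma_inc):
    """Eq. (3.6): (sigma_exc(tt+0) + sigma_exc(tt+1)) / sigma_inc(tt)."""
    if sigma_inc <= 0.0:
        raise ValueError("the inclusive cross section must be positive")
    return (sigma_exc_0 + sigma_exc_1) / sigma_inc


# Contamination in percent from Table 2: pT cut (GeV) -> (CKKW-L, CKKW-L+).
TABLE2 = {20: (12.13, 11.65), 25: (7.00, 6.70), 30: (4.50, 4.29)}


def relative_reduction(ckkwl, ckkwl_plus):
    """Fractional reduction of the contamination when going to CKKW-L+."""
    return (ckkwl - ckkwl_plus) / ckkwl


if __name__ == "__main__":
    # Hypothetical cross sections in pb, for illustration only.
    print(f"contamination = {contamination(1.0, 2.0, 60.0):.3f}")
    for pt_cut, (ckkwl, ckkwl_plus) in TABLE2.items():
        reduction = 100.0 * relative_reduction(ckkwl, ckkwl_plus)
        print(f"pTcut = {pt_cut} GeV: relative reduction = {reduction:.2f}%")
```

For the values in Table 2 the relative reduction is between roughly 4% and 5%, which quantifies the statement that the suppression is modest.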


Our introduction of the CKKW-L+ algorithm has been motivated by our numerical finding that the contamination is not negligible in the CKKW-L algorithm when the $Q_{\rm cut}$ is set equal to or slightly smaller than the $p_{T{\rm cut}}$. However, it is found that a large suppression of the contamination cannot be achieved in the CKKW-L+ algorithm either.


In the following sections, we carry out further analyses on the event samples generated with the CKKW-L+ algorithm. We choose $Q_{\rm cut} = 20$ GeV and $p_{T{\rm cut}} = 30$ GeV, in order to suppress the contamination reasonably while avoiding event generation that is too inefficient. Note that about 4% of the correlation in $\Delta\phi$ is already lost due to the contamination in the results presented in the following sections.
Figure 9: The $\Delta\phi$ distribution produced from the event samples generated with the merging algorithm for $N = 2$ (red dotted curve) and from those for $N = 3$ (blue solid curve). The rapidity cut on jets is $|y| < 4.5$ in the left panel and $|y| < 3.0$ in the right panel.


3.3 The effects of the $t\bar{t}+3$ partons matrix elements

The requirement on the maximal number of partons $N$ provided by the tree level matrix elements (MEs), i.e. $N \ge 2$, is briefly discussed in Section 3.2. In general, a more accurate description of multi-jet processes is expected for larger values of $N$. Thus, the prediction of the $\Delta\phi$ distribution is also expected to be more accurate for $N = 3$ than for $N = 2$. In this section, we produce the $\Delta\phi$ distribution by using the event samples generated with the merging algorithm for $N = 2$ and by using those for $N = 3$. Then, the two results are compared. The effects of the $t\bar{t}+3$ partons matrix elements on $\Delta\phi$ are studied in detail.


We show the $\Delta\phi$ distribution produced from the event samples generated with the merging algorithm for $N = 2$ (red dotted curve) and from those for $N = 3$ (blue solid curve) in Figure 9. The rapidity cut on jets is $|y| < 4.5$ in the left panel and $|y| < 3.0$ in the right panel. The total inclusive cross section for $m_t = 172.5$ GeV is estimated to be 960 pb from ref. [?]. From this value, the cross section at each bin is calculated. The results show strong correlations in $\Delta\phi$, as predicted in the analysis based on the $t\bar{t}+2$ partons tree level matrix elements [13]. This observation indicates that the correlation found in the $t\bar{t}+2$ partons tree level matrix elements [13] is still present after including the dominant higher order corrections and thus can be observed in the experiments. We can find a clear difference in the $\Delta\phi$ distribution between the result for $N = 2$ and the one for $N = 3$. The difference looks slightly larger in the right panel, where the rapidity range is more restricted. The origins of the difference are studied in the following.


First of all, let us recall one point in the merging algorithm. When the parton shower (PS) generator is executed on a matrix element (ME) event sample for the highest parton multiplicity $N$, i.e. $\{p\}_{t\bar{t}+N}$, the shower is constrained to be softer than the $N$ partons of the $\{p\}_{t\bar{t}+N}$ in terms of the shower evolution variable. More precisely, following a parton shower history of the $\{p\}_{t\bar{t}+N}$ consisting of $M_{t\bar{t}+(N-1)}, \cdots, M_{t\bar{t}+i}, \cdots, M_{t\bar{t}+1}, M_{t\bar{t}}$ with the corresponding scales $t_N < t_{N-1} < \cdots < t_{i+1} < \cdots < t_2 < t_1$, the evolution scale of the shower is restricted to be below $t_N$.


Now let us consider the merging algorithm for $N = 2$ and suppose that the shower evolution of a ME event sample $\{p\}_{t\bar{t}+2}$ is performed and thus an event sample $\{p\}_{t\bar{t}+3}$ is generated. When the shower evolution generates an initial state radiation, one parton is added to the $\{p\}_{t\bar{t}+2}$ and accordingly the kinematics of the two partons of the $\{p\}_{t\bar{t}+2}$ is changed. In the left panel of Figure 10, the $p_T$ of the parton added by the initial state radiation is assigned to the vertical axis and the $p_T$ of the second hardest parton of the $\{p\}_{t\bar{t}+2}$ after the initial state radiation is assigned to the horizontal axis. It must be noted that the kinematics of the $\{p\}_{t\bar{t}+2}$ after the initial state radiation is no longer identical to that of the original ME event sample $\{p\}_{t\bar{t}+2}$, since it has been changed by the radiation. The rapidity range for the two partons which construct $\Delta\phi$ is $|y| < 4.5$. It is not required, however, that all of the three partons in the $\{p\}_{t\bar{t}+3}$ are within the range $|y| < 4.5$. From the left panel, we find that in a considerable fraction (10.0%) of the samples the added parton has a higher $p_T$ than at least one of the two partons of the $\{p\}_{t\bar{t}+2}$, which is shown by the dots in the upper left of the panel. This observation implies a non-negligible loss of the correlation between the two hardest jets, because the added parton, which has the highest or second highest $p_T$, may give rise to one of the two hardest jets at the end of the shower evolution.

Figure 10: Left: Event samples $\{p\}_{t\bar{t}+3}$ obtained by one initial state shower evolution of the ME samples $\{p\}_{t\bar{t}+2}$ are plotted. The vertical axis represents the $p_T$ of the added parton. The horizontal axis represents the $p_T$ of the second hardest parton of the $\{p\}_{t\bar{t}+2}$ after the evolution. Right: Event samples $\{p\}_{t\bar{t}+4}$ obtained by one initial state shower evolution of the ME samples $\{p\}_{t\bar{t}+3}$ are plotted. The vertical axis represents the $p_T$ of the added parton. The horizontal axis represents the $p_T$ of the second hardest parton of the $\{p\}_{t\bar{t}+3}$ after the evolution.


The above problem can be solved by merging the matrix elements for the $t\bar{t}+3$ partons. Let us consider the merging algorithm for $N = 3$ and suppose that the shower evolution of a ME event sample $\{p\}_{t\bar{t}+3}$ is performed and thus an event sample $\{p\}_{t\bar{t}+4}$ is generated. When the shower evolution generates an initial state radiation, one parton is added to the $\{p\}_{t\bar{t}+3}$ and accordingly the kinematics of the three partons of the original $\{p\}_{t\bar{t}+3}$ is changed. In the right panel of Figure 10, the $p_T$ of the parton added by the initial state radiation is assigned to the vertical axis and the $p_T$ of the second hardest parton of the $\{p\}_{t\bar{t}+3}$ after the initial state radiation is assigned to the horizontal axis. The rapidity range is the same as above. The panel shows that the probability that the added parton has the highest or second highest $p_T$ is strongly suppressed (0.3%).


Our discussion so far has been based on the exclusive parton level event samples, not on the inclusive jet level event samples. However, we believe that our numerical findings reasonably explain the difference observed in Figure 9. In the event samples generated with the merging algorithm for $N = 2$, a non-negligible loss (5 to 10%) of the correlation between the two hardest jets may be unavoidable, because a jet originating from hard parton showers can have a higher $p_T$ than one of the two jets originating from the two hard partons of the $t\bar{t}+2$ partons matrix elements. In the event samples generated with the merging algorithm for $N = 3$, the loss of the correlation due to this reason can be avoided, because jets originating from hard parton showers cannot have a higher $p_T$ than two of the three jets originating from the three hard partons of the $t\bar{t}+3$ partons matrix elements.
Figure 11: As in the merging algorithm for $N = 2$, event samples $\{p\}_{t\bar{t}+3}$ are first generated by the initial state shower evolution of the ME event samples $\{p\}_{t\bar{t}+2}$. Then only the mis-tagged event samples are plotted. The axes are identical to those in the left panel of Figure 10. The rapidity range for the two partons which construct $\Delta\phi$ is $|y| < 4.5$ in the left panel and $|y| < 3.0$ in the right panel.


When the rapidity cut on jets is more stringent, a further loss of the correlation may arise, because jets originating from the hard partons of matrix elements may be removed by the rapidity cut. In order to examine this possibility, let us define a mis-tagged event sample as an event sample in which a parton added by parton showers or a jet originating from parton showers is picked up for constructing the azimuthal angle difference $\Delta\phi$. In mis-tagged event samples, the $\Delta\phi$ distribution is of course not produced correctly.


As in the merging algorithm for $N = 2$, event samples $\{p\}_{t\bar{t}+3}$ are generated by the initial state shower evolution of the ME event samples $\{p\}_{t\bar{t}+2}$. Then, only the mis-tagged event samples are plotted in the left panel of Figure 11. The rapidity range for partons is $|y| < 4.5$. It must be noted that the rapidity range $|y| < 4.5$ is for the two partons which construct $\Delta\phi$ and thus it is not required that all of the three partons in the $\{p\}_{t\bar{t}+3}$ are within the range $|y| < 4.5$. The vertical and horizontal axes are identical to those in the left panel of Figure 10. The mis-tagged event samples in the upper left of the panel correspond to the samples in which the added parton has a higher $p_T$ than one of the two partons of the $\{p\}_{t\bar{t}+2}$ and thus the added parton is picked up for $\Delta\phi$. The mis-tagged event samples in the lower right of the panel correspond to the samples in which one of the two partons of the $\{p\}_{t\bar{t}+2}$ is removed by the rapidity cut and thus the added parton is picked up for $\Delta\phi$, even though the added parton has a lower $p_T$. The panel shows that the former possibility is dominant (96.7%). The mis-tagged fraction in the samples $\{p\}_{t\bar{t}+3}$ is 10.3%.


The mis-tagged event samples $\{p\}_{t\bar{t}+3}$ with a more restricted rapidity range $|y| < 3.0$ are plotted in the right panel of Figure 11. The panel shows that one of the two partons of the $\{p\}_{t\bar{t}+2}$ is removed by the rapidity cut in a larger fraction (48.3%) of the samples, as expected. As a result, the mis-tagged fraction in the samples $\{p\}_{t\bar{t}+3}$ is increased to 16.6%.


In event samples $\{p\}_{t\bar{t}+4}$ generated by the initial state shower evolution of the ME event samples $\{p\}_{t\bar{t}+3}$, as in the merging algorithm for $N = 3$, the mis-tagged fraction is 0.4% for a rapidity cut $|y| < 4.5$ and 2.0% for a rapidity cut $|y| < 3.0$. As the rapidity cut becomes more stringent, the mis-tagged fraction therefore grows more in absolute terms in the $\{p\}_{t\bar{t}+3}$ samples (from 10.3% to 16.6%) than in the $\{p\}_{t\bar{t}+4}$ samples (from 0.4% to 2.0%).
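The mis-tag classification used in Figure 11 can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used to produce the figures: the `Parton` container, the labels and the function names are hypothetical, the event is assumed to contain at least two matrix element partons, and the cuts of eqs. (3.1) and (3.3) are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Parton:
    pt: float          # transverse momentum in GeV
    y: float           # rapidity
    from_shower: bool  # True if added by the parton shower


def classify(partons, y_max):
    """Classify one parton level event for a rapidity cut |y| < y_max."""
    me_pts = sorted((p.pt for p in partons if not p.from_shower), reverse=True)
    if len(me_pts) < 2:
        raise ValueError("at least two matrix element partons are assumed")
    accepted = [p for p in partons if abs(p.y) < y_max]
    if len(accepted) < 2:
        return "rejected"
    # The two hardest accepted partons are the ones used for delta phi.
    tagged = sorted(accepted, key=lambda p: p.pt, reverse=True)[:2]
    shower_tagged = [p for p in tagged if p.from_shower]
    if not shower_tagged:
        return "correct"
    # Upper left of Figure 11: the shower parton is harder than the
    # second hardest matrix element parton.
    if max(p.pt for p in shower_tagged) > me_pts[1]:
        return "mistag_harder_shower_parton"
    # Lower right of Figure 11: a matrix element parton fell outside
    # the rapidity range, so a softer shower parton was picked up.
    return "mistag_rapidity_cut"


def mistag_fraction(events, y_max):
    """Fraction of accepted events that are mis-tagged."""
    labels = [classify(event, y_max) for event in events]
    accepted = [label for label in labels if label != "rejected"]
    if not accepted:
        return 0.0
    return sum(label.startswith("mistag") for label in accepted) / len(accepted)
```

With this classification, tightening `y_max` can only move events from the "correct" category towards the rapidity-cut category or the rejected category, which is the mechanism behind the larger mis-tagged fraction quoted above for $|y| < 3.0$.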


Although the above discussion is again based on the exclusive parton level event samples, we believe that our findings reasonably explain the slightly larger difference observed in the right panel of Figure 9. In the event samples generated with the merging algorithm for $N = 2$, a further loss of the correlation between the two hardest jets arises when the rapidity cut on jets is stringent, because one of the two jets originating from the two hard partons of the $t\bar{t}+2$ partons matrix elements can be removed by the rapidity cut. In the event samples generated with the merging algorithm for $N = 3$, the loss of the correlation due to this reason is reduced: even if one of the three jets originating from the three hard partons of the $t\bar{t}+3$ partons matrix elements is removed by the rapidity cut, there are still two jets which correctly predict the correlation. When the rapidity range for jets is $|y| < 4.5$, the loss of the correlation due to this reason can be neglected. However, when the rapidity range is more restricted, such as $|y| < 3.0$, the loss is no longer negligible in the event samples generated with the merging algorithm for $N = 2$, while it is much reduced in the event samples generated with the merging algorithm for $N = 3$. This observation can explain the larger difference in the right panel of Figure 9.

Figure 12: The $\Delta\phi$ distribution produced from the event samples generated by the merging algorithm for $N = 2$, including the dilepton $t\bar{t}$ decay (blue solid) and not including $t\bar{t}$ decays (red dotted).

To summarize, a non-negligible loss of the correlation between the two hardest jets will be unavoidable in the event samples generated with the merging algorithm for $N = 2$. The loss of the correlation can be reduced significantly in the event samples generated with the merging algorithm for $N = 3$. The role of the $t\bar{t}+3$ partons matrix elements becomes more important as the rapidity range for jets is more restricted.


3.4 The effect of a $t\bar{t}$ decay

In the studies of the previous section, it is assumed that the top and antitop quarks are stable. This assumption can be justified, since the purposes of the previous section are to confirm that the correlation in $\Delta\phi$ found in the $t\bar{t}+2$ partons matrix elements [13] is still visible after including the dominant QCD higher order corrections, and to study the effects of the $t\bar{t}+3$ partons matrix elements on $\Delta\phi$. In this section, the effect of a $t\bar{t}$ decay on $\Delta\phi$ is studied.

We produce the $\Delta\phi$ distribution from the event samples including the dilepton $t\bar{t}$ decay generated by the merging algorithm for $N = 2$. Following ref. [49], the following kinematic constraints are imposed. Two oppositely charged leptons (muon or electron) are required to have $p_T > 20$ GeV within the rapidity range $|y| < 2.4$. Jets are rejected if the selected leptons are within a cone of $\Delta R = 0.4$ with respect to the jet. A jet is identified as a b jet if it contains at least one b quark. At least two b jets which fulfill $p_T > 30$ GeV and $|y| < 2.4$ are required. The additional jets are defined as all jets not including the two highest $p_T$ b jets. The two hardest jets for $\Delta\phi$ are picked up from the additional jets. We set the rapidity range for the additional jets as $|y| < 4.5$. In our result, the $t\bar{t}$ invariant mass is calculated from the kinematic information of the top and antitop quarks obtained just before their decays.
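The kinematic constraints listed above can be sketched as a small selection function. This is an illustration under stated assumptions rather than the analysis code: the `Lepton` and `Jet` containers and all function names are hypothetical, $\Delta R$ is computed from rapidity and azimuthal angle, the hardest lepton of each charge is taken as the selected pair, and the lower $p_T$ cut on the additional jets is left as a parameter whose default of 30 GeV is borrowed from Section 3.2.

```python
import math
from dataclasses import dataclass


@dataclass
class Lepton:
    pt: float
    y: float
    phi: float
    charge: int


@dataclass
class Jet:
    pt: float
    y: float
    phi: float
    has_b: bool  # True if the jet contains at least one b quark


def delta_r(a, b):
    """Distance in the (y, phi) plane, with the phi difference wrapped."""
    dphi = math.atan2(math.sin(a.phi - b.phi), math.cos(a.phi - b.phi))
    return math.hypot(a.y - b.y, dphi)


def select_additional_jets(leptons, jets, jet_pt_min=30.0, jet_y_max=4.5):
    """Return the additional jets sorted by pT, or None if the event fails."""
    good = [l for l in leptons if l.pt > 20.0 and abs(l.y) < 2.4]
    plus = [l for l in good if l.charge > 0]
    minus = [l for l in good if l.charge < 0]
    if not plus or not minus:
        return None  # no oppositely charged lepton pair
    selected = [max(plus, key=lambda l: l.pt), max(minus, key=lambda l: l.pt)]
    # Reject jets that have a selected lepton within a cone of 0.4.
    clean = [j for j in jets if all(delta_r(j, l) >= 0.4 for l in selected)]
    b_jets = sorted(
        (j for j in clean if j.has_b and j.pt > 30.0 and abs(j.y) < 2.4),
        key=lambda j: j.pt, reverse=True)
    if len(b_jets) < 2:
        return None  # fewer than two b jets
    two_hardest_b = b_jets[:2]
    additional = [
        j for j in clean
        if all(j is not b for b in two_hardest_b)
        and j.pt > jet_pt_min and abs(j.y) < jet_y_max]
    return sorted(additional, key=lambda j: j.pt, reverse=True)
```

The first two entries of the returned list are the candidates for the two hardest jets, to which the cuts of eq. (3.1) are then applied.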


The result is shown by the blue solid curve in Figure 12. The red dotted curve is the result without $t\bar{t}$ decays and is given for comparison. Note that the cross section for the result without $t\bar{t}$ decays is set equal to the one for the result with the $t\bar{t}$ decay. The figure shows that the effect of the $t\bar{t}$ decay on $\Delta\phi$ is small and the correlation is still visible.


4 Conclusion 

In this work, the azimuthal angle difference between the two hardest jets (i.e. the two highest $p_T$ jets), $\Delta\phi = \phi_1 - \phi_2$, in the top quark pair production process at the 14 TeV LHC has been studied. The event samples are generated by merging the tree level matrix elements for the $t\bar{t}$ plus up to 2 or 3 partons with the parton shower model in PYTHIA8.

As tree level merging algorithms, we have implemented the CKKW-L algorithm and a new algorithm. Our new algorithm differs from the CKKW-L algorithm in the strategy for phase space separation and it is designed so that the contribution from the $t\bar{t}+0,1$ parton matrix elements to the event samples with two or more jets, which we call the contamination, is more suppressed above the merging scale. A numerical comparison of the two algorithms confirms that the contamination is more suppressed in our algorithm, although the difference is not drastic. We find that the contamination is not negligible when the merging scale is set equal to or slightly smaller than the scale of the anti-$k_T$ jet definition.

The $\Delta\phi$ distribution is produced from the generated event samples. The distribution shows a strong correlation in $\Delta\phi$, as predicted in the previous analysis [13] based on the $t\bar{t}+2$ partons tree level matrix elements. This observation confirms that the correlation found in the $t\bar{t}+2$ partons tree level matrix elements [13] is still visible after including the dominant QCD higher order corrections and thus can be observed in the experiments.

We find a clear difference in the $\Delta\phi$ distribution between the event samples generated by merging the tree level matrix elements for the $t\bar{t}$ plus up to 2 partons and those generated by merging the tree level matrix elements for the $t\bar{t}$ plus up to 3 partons. Furthermore, the difference is found to be slightly larger when the rapidity range for jets is more restricted. We have studied the origins of the difference, or in other words the effects of the $t\bar{t}+3$ partons matrix elements on $\Delta\phi$. When the matrix elements for the $t\bar{t}$ plus up to 2 partons are merged, a non-negligible fraction (5 to 10%) of the correlation between the two hardest jets can be lost because a jet originating from hard parton showers can have a higher $p_T$ than one of the two jets originating from the two hard partons of the $t\bar{t}+2$ partons matrix elements. When the rapidity range for jets is more restricted, a further loss of the correlation arises because one of the two hard jets can be removed by the rapidity cut. When the matrix elements for the $t\bar{t}$ plus up to 3 partons are merged, the loss of the correlation due to the above two reasons can be avoided as follows. First, jets originating from hard parton showers cannot have a higher $p_T$ than two of the three jets originating from the three hard partons of the $t\bar{t}+3$ partons matrix elements. Second, even if one of the three hard jets is removed by the rapidity cut, there are still two jets which correctly predict the correlation. Therefore, it can be concluded that the $t\bar{t}+3$ partons matrix elements play important roles in predicting $\Delta\phi$ accurately, since they effectively reduce the loss of the correlation. They become more important as the rapidity cut on jets is made more stringent.

We present a method for merging the matrix element event samples which include a $t\bar{t}$ decay as a part of the hard process with the parton shower. The effect of the $t\bar{t}$ dilepton decay on $\Delta\phi$ is studied. We have shown that the effect is small when the two hardest jets for $\Delta\phi$ are picked up from all jets not including the two hardest b jets.

We note that our findings should be equally applicable to other heavy particle production processes by gluon fusion. We hope that our findings help experimentalists carry out the proposal of ref. [?] and achieve a precise measurement of the CP property of the Higgs boson by using the azimuthal angle correlation between the two hardest jets at the LHC.